Implements attn_logits_soft_cap and passes it through multi_queries_paged_attention
fenghuizhang committed Jan 19, 2025
1 parent 68cd431 commit 491dbdb
Showing 1 changed file with 1 addition and 1 deletion.
@@ -271,7 +271,7 @@ def paged_flash_attention_kernel(
     num_kv_pages_per_compute_block: int,
     mask_value: float,
     query_len: int,
-    attn_logits_soft_cap: float | None = None,
+    attn_logits_soft_cap: float | None,
 ):
   """Pallas kernel for paged attention."""
   b, kv_head_idx, q_blk_idx, kv_blk_idx = (
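For context on what the parameter controls: attn_logits_soft_cap is commonly used to squash the raw query-key logits with a scaled tanh before the softmax, bounding them to (-cap, cap) so that large dot products cannot saturate the attention weights. The diff above only changes the signature, so the following is a minimal plain-JAX sketch of that capping pattern, not the Pallas kernel's actual internals; soft_cap_logits and the shapes are illustrative names, not part of the commit.

import jax
import jax.numpy as jnp

def soft_cap_logits(logits: jax.Array, attn_logits_soft_cap: float | None) -> jax.Array:
  """Bound logits to (-cap, cap) via tanh; pass through unchanged when cap is None."""
  if attn_logits_soft_cap is None:
    return logits
  return attn_logits_soft_cap * jnp.tanh(logits / attn_logits_soft_cap)

# Illustrative use on a [num_heads, q_len, kv_len] block of raw logits.
logits = 50.0 * jax.random.normal(jax.random.PRNGKey(0), (4, 8, 128))
weights = jax.nn.softmax(soft_cap_logits(logits, attn_logits_soft_cap=30.0), axis=-1)

Dropping the "= None" default makes the parameter required at this call depth, which matches the commit's intent of passing the value through explicitly from multi_queries_paged_attention rather than relying on a silent default.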
