cuDNN SDPA executor does not support batched broadcast across attention heads mask #1799

IvanYashchuk · 2025-02-25T09:43:30Z

🐛 Bug

To reproduce:

1. Check out the branch of "Update cuDNN SDPA execution functions to use the strides of attn_mask" (#1798) if it has not been merged yet.

2. Run the test added in #1798: `pytest thunder/tests/test_cudnn_executor.py::test_make_cudnn_sdpa_backward_graph_with_mask`

lightning-thunder/thunder/tests/test_cudnn_executor.py

Lines 65 to 72 in 839d65f

    
    # "B1TS" case is not working
    mask = torch.ones(batch_size, 1, sequence_length, sequence_length, dtype=torch.bool, device="cuda")
    mask = torch.where(mask, 0.0, float("-inf")).to(torch.bfloat16)
    # Reason: dbias cannot be reduced just across heads.
    # Supported reduction patterns are [1,1,T,S], [1,H,T,S], [B,H,T,S] at: (!batch_dim_reduction_requested) && head_dim_reduction_requested
    # See issue: ...
    with pytest.raises(cudnn.cudnnGraphNotSupportedError, match="No valid engine configs"):
        _make_cudnn_sdpa_backward_graph(q, k, v, mask, 0.0, None)

Supporting this case is required for batched packed sequences (#1758).
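
To make the rule in the test comment concrete, here is a minimal sketch of the shape check it describes. It only encodes the three patterns listed in the comment and is not how cuDNN or Thunder decides support. The names `dbias_reduction_dims` and `is_supported_mask_shape` are hypothetical helpers introduced for illustration.

```python
from typing import Sequence, Tuple

# Reduction patterns from the test comment, as (reduce_over_batch, reduce_over_heads).
# Assumption: this mirrors the comment only, not cuDNN's actual support logic.
SUPPORTED_DBIAS_REDUCTIONS = {
    (True, True),    # mask shape [1, 1, T, S]
    (True, False),   # mask shape [1, H, T, S]
    (False, False),  # mask shape [B, H, T, S]
}


def dbias_reduction_dims(
    mask_shape: Sequence[int], batch_size: int, num_heads: int
) -> Tuple[bool, bool]:
    """Return which of (batch, heads) the mask gradient must be summed over.

    A dimension needs a reduction when the mask has size 1 there while the
    attention scores have a larger size, i.e. the mask was broadcast.
    """
    if len(mask_shape) != 4:
        raise ValueError("expected a 4-D mask of shape [B or 1, H or 1, T, S]")
    b, h = mask_shape[0], mask_shape[1]
    if b not in (1, batch_size) or h not in (1, num_heads):
        raise ValueError("mask is not broadcastable to [B, H, T, S]")
    reduce_batch = b == 1 and batch_size != 1
    reduce_heads = h == 1 and num_heads != 1
    return reduce_batch, reduce_heads


def is_supported_mask_shape(
    mask_shape: Sequence[int], batch_size: int, num_heads: int
) -> bool:
    """True if the required dbias reduction is one of the listed patterns."""
    return dbias_reduction_dims(mask_shape, batch_size, num_heads) in SUPPORTED_DBIAS_REDUCTIONS
```

With `batch_size=2` and `num_heads=4`, a mask of shape `(2, 1, T, S)` requires a reduction over heads only, which is the `(!batch_dim_reduction_requested) && head_dim_reduction_requested` case that the comment reports as rejected.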


IvanYashchuk added the cudnn label Feb 25, 2025
