Feature request: Add Dropout #7
Yes, it is on our plan; actually WIP. Since Triton now provides a pseudo-random generator, we can implement a memory-efficient flash attention with dropout without having to save the dropout mask (which would require O(n^2) memory). The essence is to re-generate the same dropout mask in the backward pass as was used in the forward pass.
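For context, a minimal sketch of the seeded-dropout idea, adapted from Triton's low-memory dropout tutorial (function names are illustrative, not this repo's kernel): the same `(seed, offsets)` pair regenerates the same mask on every call, so the backward pass can rebuild it instead of storing it.

```python
import torch
import triton
import triton.language as tl


@triton.jit
def _seeded_dropout(x_ptr, out_ptr, n_elements, p, seed, BLOCK_SIZE: tl.constexpr):
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements
    x = tl.load(x_ptr + offsets, mask=mask)
    # Deterministic per-element randomness: the same seed and offsets
    # always produce the same values, so the mask can be regenerated.
    random = tl.rand(seed, offsets)
    keep = random > p
    # Inverted dropout: scale kept elements by 1 / (1 - p).
    out = tl.where(keep, x / (1 - p), 0.0)
    tl.store(out_ptr + offsets, out, mask=mask)


def seeded_dropout(x: torch.Tensor, p: float, seed: int) -> torch.Tensor:
    out = torch.empty_like(x)
    n_elements = x.numel()
    grid = lambda meta: (triton.cdiv(n_elements, meta["BLOCK_SIZE"]),)
    _seeded_dropout[grid](x, out, n_elements, p, seed, BLOCK_SIZE=1024)
    return out
```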
I have implemented a prototype which seems to work here, but it's hard to test correctness without separately implementing the dropout layer and comparing against it, since it uses a different random seed (and generator) than torch.
Yes, testing for randomness is tricky. There is no proper reference to compare against elementwise, so correctness has to be checked statistically. This method can be applied to a separate dropout operator. It can also be applied to the dropout part of a more complex operator, but the overall testing for correctness is then more complicated than for operators without randomness involved.
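A rough sketch of the kind of statistical check this implies, assuming a `seeded_dropout(x, p, seed)` helper like the sketch above: instead of an elementwise comparison against torch (whose RNG differs), verify the drop ratio and the output mean, plus determinism under a fixed seed.

```python
import torch

x = torch.ones(1_000_000, device="cuda")
p = 0.3
y = seeded_dropout(x, p=p, seed=42)

# Roughly a fraction p of elements should be zeroed out.
drop_ratio = (y == 0).float().mean().item()
assert abs(drop_ratio - p) < 1e-2

# Inverted scaling by 1 / (1 - p) should preserve the mean.
assert abs(y.mean().item() - 1.0) < 1e-2

# Determinism: the same seed must reproduce the same mask
# (the property the backward pass relies on).
y2 = seeded_dropout(x, p=p, seed=42)
assert torch.equal(y, y2)
```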
Finally done in #23.
The PyTorch base implementation of scaled_dot_product_attention provides dropout as an arg (see the sketch below). Fusing it into the Triton kernel would replicate that functionality, as dropout is applied to the attention scores, not the output. In the CUDA version, it is supported here. There have been attempts at integrating this into Triton before.
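For reference, a minimal usage sketch of the PyTorch API being discussed: `dropout_p` applies dropout to the attention weights inside the fused op, which is the semantics a fused Triton kernel would need to match. Shapes and values here are illustrative.

```python
import torch
import torch.nn.functional as F

# (batch, heads, seq_len, head_dim)
q, k, v = (torch.randn(2, 8, 128, 64, device="cuda") for _ in range(3))

# Dropout is applied to the attention scores inside the kernel,
# not to the final output.
out = F.scaled_dot_product_attention(q, k, v, dropout_p=0.1, is_causal=True)
```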