Flash attention for GPUs like in maxtext #149
Conversation
@ksikiric left some comments. Overall this looks good. The last comment is to move the training part of the code out of this PR; we can work on it in the Flux training PR.
Force-pushed from 7b6ee9d to 350026b
@ksikiric tested this and it looks great. Added a few comments; once resolved, this should be ready to merge into main. Please rebase with main and run …
Start creating generation code for flux.
Force-pushed from 350026b to 3141d69
@ksikiric let me know when you can take a look at the latest comments. This is very close to being ready! :)
@entrpn I can't see any comments other than the ones I have already marked as resolved; are there any others I am missing?
@ksikiric my bad, forgot to click the button. Take a look and let me know if you see them now.
@entrpn I fixed those comments now!
Thank you.
Related to #147
Adding flash attention (FA) support for GPUs using TransformerEngine, following the same approach as maxtext. These changes are built on top of #147, which has been rebased onto flux_lora as per #148.
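
For context, below is a minimal sketch of what calling TransformerEngine's fused (flash) attention from JAX/Flax can look like, in the spirit of the maxtext integration. This is not this PR's code: the module path `transformer_engine.jax.flax.DotProductAttention` and its argument names follow recent TE releases and may differ between versions, and all shapes and settings here are illustrative assumptions.

```python
# Minimal sketch (not this PR's code) of TransformerEngine fused attention
# in JAX/Flax, in the spirit of the maxtext integration. Module path and
# argument names follow recent TE releases; verify against your installed
# transformer_engine version. Requires an NVIDIA GPU with cuDNN support.
import math

import jax
import jax.numpy as jnp
from transformer_engine.jax.flax import DotProductAttention

batch, seq_len, num_heads, head_dim = 2, 1024, 16, 64  # illustrative shapes

dpa = DotProductAttention(
    head_dim=head_dim,
    num_attention_heads=num_heads,
    num_gqa_groups=num_heads,          # plain MHA: one KV group per head
    attn_mask_type="no_mask",          # diffusion attention is not causal
    attn_bias_type="no_bias",
    attention_dropout=0.0,
    dtype=jnp.bfloat16,
    qkv_layout="BSHD_BSHD_BSHD",       # separate q/k/v, [batch, seq, heads, dim]
    scale_factor=1.0 / math.sqrt(head_dim),
    transpose_batch_sequence=False,
)

rng = jax.random.PRNGKey(0)
kq, kk, kv = jax.random.split(rng, 3)
shape = (batch, seq_len, num_heads, head_dim)
q = jax.random.normal(kq, shape, dtype=jnp.bfloat16)
k = jax.random.normal(kk, shape, dtype=jnp.bfloat16)
v = jax.random.normal(kv, shape, dtype=jnp.bfloat16)

params = dpa.init(jax.random.PRNGKey(1), q, k, v)
out = dpa.apply(params, q, k, v)       # dispatched to TE's fused kernels
print(out.shape)                       # (2, 1024, 16, 64)
```

On a supported GPU, TE should dispatch this call to its cuDNN fused-attention kernels, which is where the flash-attention speedup over an unfused `jnp` implementation comes from.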