Flash attention for GPUs like in maxtext #149
Conversation
@ksikiric left some comments. Overall this looks good. The last comment is to move the training part of the code out of this PR; we can work on it in the Flux training PR.
Force-pushed from 7b6ee9d to 350026b
@ksikiric tested this and it looks great. Added a few comments; once resolved, this should be ready to merge into main. Please rebase with main and run …
Start creating generation code for flux.
Force-pushed from 350026b to 3141d69
@ksikiric let me know when you can take a look at the latest comments. This is very close to being ready! :)
@entrpn I can't see any comments other than the ones I have already marked as resolved; are there any others I am missing?
@ksikiric my bad, forgot to click the button. Take a look and let me know if you see them now.
@entrpn I fixed those comments now!
Thank you.
Related to #147
Adding flash attention (FA) support for GPUs using TransformerEngine, following the same approach as maxtext. These changes are built on top of #147, which has been rebased onto flux_lora as per #148.
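
For context, below is a minimal sketch of what calling TransformerEngine's fused (flash) attention from JAX/Flax can look like, in the spirit of the maxtext integration. This is not this PR's code: the module path `transformer_engine.jax.flax.DotProductAttention` and its argument names follow recent TE releases and may differ between versions, and all shapes and settings here are illustrative assumptions.

```python
# Minimal sketch (not this PR's code) of TransformerEngine fused attention
# in JAX/Flax, in the spirit of the maxtext integration. Module path and
# argument names follow recent TE releases; verify against your installed
# transformer_engine version. Requires an NVIDIA GPU with cuDNN support.
import math

import jax
import jax.numpy as jnp
from transformer_engine.jax.flax import DotProductAttention

batch, seq_len, num_heads, head_dim = 2, 1024, 16, 64  # illustrative shapes

dpa = DotProductAttention(
    head_dim=head_dim,
    num_attention_heads=num_heads,
    num_gqa_groups=num_heads,          # plain MHA: one KV group per head
    attn_mask_type="no_mask",          # diffusion attention is not causal
    attn_bias_type="no_bias",
    attention_dropout=0.0,
    dtype=jnp.bfloat16,
    qkv_layout="BSHD_BSHD_BSHD",       # separate q/k/v, [batch, seq, heads, dim]
    scale_factor=1.0 / math.sqrt(head_dim),
    transpose_batch_sequence=False,
)

rng = jax.random.PRNGKey(0)
kq, kk, kv = jax.random.split(rng, 3)
shape = (batch, seq_len, num_heads, head_dim)
q = jax.random.normal(kq, shape, dtype=jnp.bfloat16)
k = jax.random.normal(kk, shape, dtype=jnp.bfloat16)
v = jax.random.normal(kv, shape, dtype=jnp.bfloat16)

params = dpa.init(jax.random.PRNGKey(1), q, k, v)
out = dpa.apply(params, q, k, v)       # dispatched to TE's fused kernels
print(out.shape)                       # (2, 1024, 16, 64)
```

On a supported GPU, TE should dispatch this call to its cuDNN fused-attention kernels, which is where the flash-attention speedup over an unfused `jnp` implementation comes from.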