
Flux inference implementation #146

Merged
merged 31 commits into from
Feb 12, 2025

Conversation

entrpn
Collaborator

@entrpn entrpn commented Feb 5, 2025

This PR implements Flux dev and schnell for inference.

Features:

  • Loads the encoders, VAE, and transformer to HBM, with an optional encoder offload. All settings are available in the Flux configs.
  • Optimized flash attention block sizes for v6e.
  • Flash attention.
  • Sharding: DDP and FSDP. FSDP sharding allows loading all models on HBM and keeping them loaded, at a slight performance cost.
  • Removed maxdiffusion.transformers and moved back to the HF transformers library for its support of new models. I originally ported some of the models from transformers into maxdiffusion.transformers but saw no improvement from sharding them, so this change reduces the maintenance burden of keeping those models inside maxdiffusion.
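The DDP/FSDP split described above can be sketched with JAX's sharding API. This is a minimal illustration, not maxdiffusion's actual code: the mesh axis names, array shapes, and partition specs are assumptions chosen to show the pattern of sharding weights over an "fsdp" axis and batches over a "data" axis.

```python
import numpy as np
import jax
import jax.numpy as jnp
from jax.sharding import Mesh, NamedSharding, PartitionSpec as P

# Hypothetical 2-D device mesh: "data" for DDP-style batch sharding,
# "fsdp" for sharding model weights across devices.
n = jax.device_count()
mesh = Mesh(np.array(jax.devices()).reshape(1, n), axis_names=("data", "fsdp"))

# Stand-in transformer weight: shard its first dimension over the fsdp axis,
# so each device holds only a slice of the parameters (FSDP-style).
w = jnp.zeros((8, 4))
w_sharded = jax.device_put(w, NamedSharding(mesh, P("fsdp", None)))

# Batch activations shard over the data axis (DDP-style).
x = jnp.ones((16, 4))
x_sharded = jax.device_put(x, NamedSharding(mesh, P("data", None)))

print(w_sharded.shape, x_sharded.shape)
```

With FSDP the full parameter set stays resident across the devices' combined HBM, which is what lets all models remain loaded at a small performance cost, as the bullet above notes.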

@jcaraban
Copy link

@entrpn I will close #141 since this PR is in a mergeable state now.
@ksikiric rebased our training on top of flux_impl and opened a continuation PR #147.
After that, we still plan to submit another PR with flash_attn support for GPUs, similar to MaxText.

@entrpn
Collaborator Author

entrpn commented Feb 12, 2025


Thanks @jcaraban and @ksikiric, I appreciate it very much. I plan to merge this PR by EOW. On top of that, I've been working on LoRA loading for Flux, which is almost complete, and I will also be pushing a few fixes for some issues I found. I'll make sure to keep you informed on these items as they move along.

I've also been working with my team internally to ensure that you and your employer are attributed for your work. Thanks again!

@entrpn entrpn merged commit 7f0f5bc into main Feb 12, 2025
2 of 3 checks passed