
[Feature]: Flash Attention 3 Support for MI300X GPUs #71

Open
codinggosu opened this issue Aug 1, 2024 · 1 comment

Comments

@codinggosu

codinggosu commented Aug 1, 2024

Suggestion Description

Context:
We are evaluating performance on the latest MI300X GPUs and have observed that Flash Attention 2 provides a roughly 20% performance improvement over our existing setup. This enhancement has been significant for our projects.

Request:
We are eager to explore the benefits that Flash Attention 3 could offer, but were disappointed to find that it currently supports only NVIDIA Hopper architectures.

Inquiry:

Is there a timeline or plan for supporting Flash Attention 3 on MI300X GPUs?
If there is no official plan, are there any guides or resources available that could assist us in implementing support ourselves?
Thank you for your attention to this matter. We look forward to any information you can provide.
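As a side note for anyone reproducing such a comparison, it helps to first confirm which flash-attn build is actually installed in the environment. A minimal, hedged sketch (the pip distribution name `flash-attn` is an assumption about the packaging used; adjust for your environment):

```python
# Sketch (not from this thread): report the installed flash-attn version,
# useful when comparing FA2 vs. FA3 benchmark numbers.
# "flash-attn" / "flash_attn" as distribution names are assumptions.
from importlib.metadata import PackageNotFoundError, version


def flash_attn_version() -> str:
    """Return the installed flash-attn distribution version, if any."""
    for name in ("flash-attn", "flash_attn"):
        try:
            return version(name)
        except PackageNotFoundError:
            continue
    return "not installed"


print("flash-attn:", flash_attn_version())
```

This only checks package metadata; it does not confirm that the flash-attention kernels are actually selected at runtime by your framework.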

Operating System

Ubuntu

GPU

MI300X

ROCm Component

No response

@poyenc

poyenc commented Feb 12, 2025

Thanks for your interest. For the CK (Composable Kernel) backend, we currently have a dedicated branch, ck_tile/fa3, which provides preliminary FA v3 backward-pass support. However, we do not yet have a timeline for adding FA v3 forward support.
