
Set xla_tpu_enable_flash_attention=false to enable libtpu pin update #8008

Merged · 3 commits · Sep 13, 2024

Conversation

bhavya01 (Collaborator) commented:

This flag enables a FlashAttention HLO pass that pattern-matches attention in the HLO graph and rewrites it as flash attention. That pattern matching is currently misfiring on our standard dot-product attention, so this change turns the flag off until the pattern matching is fixed.

This change also needs to update the expected payload for the Pallas test. A sketch of how such a flag can be passed is shown below.
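
As a hedged illustration (not part of this PR's diff): libtpu/XLA TPU flags like this one are commonly passed to PyTorch/XLA through the `LIBTPU_INIT_ARGS` environment variable before the TPU runtime initializes. The tensor shapes and the attention call below are assumptions for demonstration only.

```python
import os

# Disable the FlashAttention HLO rewrite pass. LIBTPU_INIT_ARGS must be
# set before the TPU runtime is initialized (i.e., before the first
# torch_xla device is created).
os.environ["LIBTPU_INIT_ARGS"] = (
    os.environ.get("LIBTPU_INIT_ARGS", "")
    + " --xla_tpu_enable_flash_attention=false"
).strip()

import torch
import torch.nn.functional as F
import torch_xla.core.xla_model as xm

device = xm.xla_device()

# Standard scaled dot-product attention; with the flag off, the HLO pass
# no longer tries to rewrite this pattern into a flash-attention kernel.
q = torch.randn(2, 8, 128, 64, device=device)
k = torch.randn(2, 8, 128, 64, device=device)
v = torch.randn(2, 8, 128, 64, device=device)
out = F.scaled_dot_product_attention(q, k, v)
xm.mark_step()  # materialize the lazily traced graph
```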

@bhavya01 bhavya01 added the tpuci label Sep 13, 2024
@bhavya01 bhavya01 self-assigned this Sep 13, 2024
@bhavya01 bhavya01 requested a review from ManfeiBai September 13, 2024 23:14
ManfeiBai (Collaborator) left a comment:

Thanks, LGTM

2 participants