Merge OpenAI Triton commit b39c1e1
#3302
Merged
This PR uses a more efficient approach for the type conversion from fp32 to bf16 in the HIP backend. In a simple unit test, the number of VGPRs used decreases from 18 to 10.
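For reference, the semantics of an fp32 to bf16 conversion can be sketched on the host as below. This is only an illustration of round-to-nearest-even truncation to the upper 16 bits; it is not the instruction sequence the HIP backend emits, and the helper names `fp32_to_bf16_rne` and `bf16_to_fp32` are hypothetical.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper (not from the PR): convert an fp32 value to its bf16
// bit pattern, rounding to nearest with ties to even.
inline std::uint16_t fp32_to_bf16_rne(float value) {
  std::uint32_t bits;
  std::memcpy(&bits, &value, sizeof(bits));
  // NaN: force a quiet NaN so the rounding add cannot turn it into infinity.
  if ((bits & 0x7fffffffu) > 0x7f800000u) {
    return static_cast<std::uint16_t>((bits >> 16) | 0x0040u);
  }
  // Add 0x7fff plus the lowest kept bit, so exact ties round to even.
  std::uint32_t lsb = (bits >> 16) & 1u;
  bits += 0x7fffu + lsb;
  return static_cast<std::uint16_t>(bits >> 16);
}

// Hypothetical helper: widen a bf16 bit pattern back to fp32 (exact).
inline float bf16_to_fp32(std::uint16_t h) {
  std::uint32_t bits = static_cast<std::uint32_t>(h) << 16;
  float out;
  std::memcpy(&out, &bits, sizeof(out));
  return out;
}
```

Under these assumptions, `fp32_to_bf16_rne(1.0f)` yields the bit pattern `0x3f80`, and a value exactly halfway between two bf16 values rounds to the one whose lowest mantissa bit is zero.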
This is a minor change. When implementing PR #5549 I used `rewriter.notifyMatchFailure` in place of `return failure();`, as suggested, to leverage the MLIR infrastructure for errors. We should probably be consistent throughout the file and use the MLIR infrastructure for the other buffer ops.
https://github.com/triton-lang/triton/pull/5542/files introduced a dependency on `unordered_map` without including the `<unordered_map>` header. This currently works on some systems if other headers happen to include `<unordered_map>`, but fails on others.
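A minimal sketch of the fix pattern, assuming a hypothetical lookup function `lookupBitWidth` that is not part of the PR: any translation unit that names `std::unordered_map` includes the header directly, so it no longer depends on what other headers pull in.

```cpp
#include <string>
#include <unordered_map>  // included directly, not via a transitive include

// Hypothetical lookup table, used only to illustrate the explicit include.
inline int lookupBitWidth(const std::string &name) {
  static const std::unordered_map<std::string, int> widths = {
      {"fp32", 32}, {"bf16", 16}};
  auto it = widths.find(name);
  // Return -1 for names that are not in the table.
  return it == widths.end() ? -1 : it->second;
}
```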
Initial support for Nvidia Blackwell GPUs (sm_100). The key contributions included in this PR are:
* Support for 5th generation Tensor Core.
* Modeling and support of Tensor Memory.
* Native support for microscaling formats mxfp4 and mxfp8.
* Improvements to the software pipeliner to take advantage of Tensor Cores and Tensor Memory.

This was developed in close collaboration between Nvidia and OpenAI.

From Nvidia:
dePaul Miller (@depaulmillz)
Samantha Hirsch (@Sam3077)
Yujia Zhai (@yzhaiustc)
Shang Zhang (@shangz-ai)
Pradeep Ramani (@IonThruster)
Matthew Brookhart (@mbrookhart)
Masahiro Masuda (@masahi)
Chris Sullivan (@csullivan)
Clive Unger (@CliveUnger)
Jason Knight (@binarybana)

From OpenAI:
Pawel Szczerbuk (@pawelszczerbuk)
Peter Bell (@peterbell10)
Phil Tillet (@ptillet)
Jeff Niu (@jeffniu-openai)
Thomas Raoux (@ThomasRaoux)

Co-authored-by: Baogang Song <[email protected]>
Co-authored-by: Pawel Szczerbuk <[email protected]>
Co-authored-by: Sergei Vorobev <[email protected]>
Co-authored-by: ionthruster <[email protected]>
Co-authored-by: dePaul Miller <[email protected]>
Co-authored-by: Matthew Brookhart <[email protected]>
Co-authored-by: Chris Sullivan <[email protected]>
Co-authored-by: Masahiro Masuda <[email protected]>
Co-authored-by: Chris Sullivan <[email protected]>
Co-authored-by: peterbell10 <[email protected]>
Co-authored-by: jeffniu-openai <[email protected]>
anmyachev approved these changes Jan 28, 2025
Fix build failure from `b39c1e1`. Signed-off-by: Whitney Tsang <[email protected]>
4d8da9e to 8545a9f
8a24902 to 37388dd
Signed-off-by: Whitney Tsang <[email protected]>
37388dd to d5d889e
…ast '/std:c++20'` Signed-off-by: Whitney Tsang <[email protected]>
1d45243 to 8852bb3
Signed-off-by: Whitney Tsang <[email protected]>
8852bb3 to 3f44826
47c730b
b39c1e1
This PR changes the Triton base from 0ecb172 to b39c1e1 (Jan 28).
Pass rate: 99.44% -> 98.19% (#3307)
Please do not squash and merge this PR.