Merge OpenAI Triton commit b39c1e1 #3302

Merged: whitneywhtsang merged 10 commits into main from whitneywhtsang/merge on Jan 30, 2025

whitneywhtsang (Contributor) commented Jan 28, 2025

This PR changes the Triton base from 0ecb172 to b39c1e1 (Jan 28).
Pass rate: 99.44% -> 98.19% (#3307)

Please do not squash and merge this PR.

scxiao and others added 4 commits January 27, 2025 10:08
This PR uses a more efficient approach for the type
conversion from fp32 to bf16 in the HIP backend.
In a simple unit test, the number of VGPRs
used decreases from 18 to 10.
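
For reference, here is a minimal Python sketch of what an fp32-to-bf16 conversion computes at the bit level. It assumes round-to-nearest-even semantics, which the commit message does not state, and it does not reproduce the HIP backend's instruction sequence or its register usage. The helper names `fp32_to_bf16_bits` and `bf16_bits_to_float` are hypothetical.

```python
import struct


def fp32_to_bf16_bits(x: float) -> int:
    """Return the 16-bit bf16 pattern for x (assumed round-to-nearest-even)."""
    # Reinterpret the value as its 32-bit IEEE-754 single-precision pattern.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]

    # NaN: exponent all ones and a non-zero mantissa. Truncate and force a
    # mantissa bit so the result stays a NaN instead of collapsing to infinity.
    if (bits & 0x7F800000) == 0x7F800000 and (bits & 0x007FFFFF) != 0:
        return ((bits >> 16) | 0x0040) & 0xFFFF

    # bf16 keeps the upper 16 bits. Adding 0x7FFF plus the lowest kept bit
    # rounds to nearest and breaks exact ties toward an even result.
    lsb = (bits >> 16) & 1
    rounded = bits + 0x7FFF + lsb
    return (rounded >> 16) & 0xFFFF


def bf16_bits_to_float(b: int) -> float:
    """Widen a bf16 bit pattern back to a Python float (exact)."""
    return struct.unpack("<f", struct.pack("<I", (b & 0xFFFF) << 16))[0]
```

A sketch like this can serve as a bit-exact reference when checking a backend's output, for example by comparing `fp32_to_bf16_bits(v)` against the value a kernel produces for the same input `v`.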

This is a minor change. When implementing PR #5549, I used
`rewriter.notifyMatchFailure` in place of `return failure();`, as
suggested, to leverage the MLIR infrastructure for errors.

We should probably be consistent throughout the file and use the MLIR
infrastructure for the other buffer ops.

https://github.com/triton-lang/triton/pull/5542/files introduced a
dependency on unordered_map without including the unordered_map
header.

This currently works on systems where other headers happen to include
unordered_map, but fails on others.

Initial support for Nvidia Blackwell GPUs (sm_100).

The key contributions included in this PR are:
* Support for 5th-generation Tensor Cores.
* Modeling and support of Tensor Memory.
* Native support for microscaling formats mxfp4 and mxfp8.
* Improvements to the software pipeliner to take advantage of Tensor Cores and Tensor Memory.

This was developed in close collaboration between Nvidia and OpenAI.

From Nvidia:
dePaul Miller (@depaulmillz)
Samantha Hirsch (@Sam3077)
Yujia Zhai (@yzhaiustc)
Shang Zhang (@shangz-ai)
Pradeep Ramani (@IonThruster)
Matthew Brookhart (@mbrookhart)
Masahiro Masuda (@masahi)
Chris Sullivan (@csullivan)
Clive Unger (@CliveUnger)
Jason Knight (@binarybana)

From OpenAI:
Pawel Szczerbuk (@pawelszczerbuk)
Peter Bell (@peterbell10)
Phil Tillet (@ptillet)
Jeff Niu (@jeffniu-openai)
Thomas Raoux (@ThomasRaoux)

Co-authored-by: Baogang Song
Co-authored-by: Pawel Szczerbuk
Co-authored-by: Sergei Vorobev
Co-authored-by: ionthruster
Co-authored-by: dePaul Miller
Co-authored-by: Matthew Brookhart
Co-authored-by: Chris Sullivan
Co-authored-by: Masahiro Masuda
Co-authored-by: peterbell10
Co-authored-by: jeffniu-openai
whitneywhtsang marked this pull request as ready for review January 29, 2025 19:19
whitneywhtsang force-pushed the whitneywhtsang/merge branch 2 times, most recently from 8a24902 to 37388dd, January 29, 2025 19:44
whitneywhtsang force-pushed the whitneywhtsang/merge branch 4 times, most recently from 1d45243 to 8852bb3, January 29, 2025 23:38
whitneywhtsang changed the title from "Merge OpenAI Triton commit 47c730b" to "Merge OpenAI Triton commit b39c1e1", Jan 30, 2025
whitneywhtsang merged commit 3f44826 into main, Jan 30, 2025
5 checks passed
whitneywhtsang deleted the whitneywhtsang/merge branch January 30, 2025 14:30
6 participants