-
Notifications
You must be signed in to change notification settings - Fork 96
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Blackhole Ops] Fast Dispatch Post Commit Issue Master List #12349
Comments
@mywoodstock @tarafdarTT @sjameelTT @bbradelTT @yugaoTT @razorback3 FYI |
FYI - CNN list for Blackhole tracked at - #11124 |
@abhullar-tt I updated the list including my individual issues tracking this. |
Lots of tests fail when input is float32 format, is it not supported or buggy on BH? |
@rtawfik01 @nvelickovicTT do you know if there are issues with float32? |
@yugaoTT float32 has been tested in blackhole, can you provide the simplest test case that fails with float32? Preferably a single-tile case? Its more likely a metal api issue, and the fp32 flag is not passed to blakchole llks fully |
These ops from Falcon7b seem to be missing, it would be good to double check if they're failing as well:
|
BH post commit will run everything that post commit on main runs, unless explicitly tagged with |
Thanks @abhullar-tt |
BH post commit is clean on main and the ops listed by Salar are covered. From Colman's list:
|
paged update cache is tested in this postcommit pipeline https://github.com/tenstorrent/tt-metal/actions/runs/11815965526/job/32918624681#logs |
#12550: re-enable some permute tests, disable the ones that aren't working
#12550: re-enable some permute tests, disable the ones that aren't working
#12550: re-enable some permute tests, disable the ones that aren't working
#12550: re-enable some permute tests, disable the ones that aren't working
#12550: re-enable some permute tests, disable the ones that aren't working
#12550: re-enable some permute tests, disable the ones that aren't working
#12550: re-enable some permute tests, disable the ones that aren't working
#12550: re-enable some permute tests, disable the ones that aren't working
…d do a minor refactor of ttnn::permute (#15881) ### Ticket #14790 add transpose wh sharded implementation when shard shape < height dimension #15165 add N-d permute with width dimension #15589 correct permute dimensionality when less than 4D #15750 remove the composite flag from permute #12550 re-enable some permute tests for blackhole #12349 re-enable working transpose tests for blackhole #16066 disable test uniform as it's stochastic ### Problem description This PR addresses several permute and transpose problems all at once - Transpose WH sharded does not currently work when the shard shape is less than the height - Permute on greater than 4 dimensions does not work when moving width around (for both tiled and RM) - The Permute kernel when width doesn't change is single core - Permute has an unclean API in which we have a composite flag that is not generically applicable - Permute on less than 4 dimensions gets an incorrect output shape in cases where it's a no-op - Permute tests are disabled for BH due to LLK issues - Transpose tests are disabled for BH due to LLK issues ### What's changed - Add transpose WH sharded implementation for when shard shape is less than the height dim (outputs a block sharded output) - Add an N-d permute kernel that works generically on any row major input. We have to call a global init each loop of the compute kernel as transpose sets some registers that aren't cleared (there's no transpose_uninit). This results in bad pcc when there's more than one loop. For GS/BH, even the global init doesn't solve the problem so the test is disabled. For Tiled, we need 5D untilize/tilize. This increases sweeps coverage for permute from **50%** to **86%** - For the optimized case where Permute's width dimension is not shuffled, make the kernel multicore - Remove composite flag that is default set to to make permute non-generic. This has caused forge models to have bad pcc as they were not aware of this optional argument. - Refactor ttnn::permute to add nop checks and correct shape calculations - Re-enable permute and transpose tests for blackhole When replacing variants of transpose with this RM permute kernel, a lot of tests on BH/GS failed, so I will do that in a follow-up to address. The LLK issues are causing pains there. If we get N-d untilize/tilize support and once the LLK issues are fixed, permute should have the ability to be generic. The remaining issues for the pytorch 2.0 sweeps after the untilize/tilize fix are the CB overflow on transpose wh, which should be fixed out of the box when we replace the kernel that is used (which I am not doing in this PR since it doesn't work for GS/BH atm). ### Checklist - [x] Post commit CI passes https://github.com/tenstorrent/tt-metal/actions/runs/12367177499/job/34547311782 (failing test is failing on main) - [x] Blackhole Post commit (if applicable) https://github.com/tenstorrent/tt-metal/actions/runs/12367175575 - [x] Model regression CI testing passes (if applicable) https://github.com/tenstorrent/tt-metal/actions/runs/12357119737 - [x] Device performance regression CI testing passes (if applicable) https://github.com/tenstorrent/tt-metal/actions/runs/12357115316 - [ ] **(For models and ops writers)** Full [new models](https://github.com/tenstorrent/tt-metal/actions/workflows/full-new-models-suite.yaml) tests passes - [ ] New/Existing tests provide coverage for changes
This is a master list of all tests that needed to be disabled to enable passing
fast-dispatch-build-and-unit-tests.yaml
on BHGoal: Remove instances of
skip_for_blackhole
that are not skipped for WHMatmul
Softmax
LayerNorm
Moreh Ops
Loss Ops
Downsample
Multi Op
Rotary Embeddings (not a TM, does a positional transformation using a matmul)
TMs
LLKs
unpack_tilizeA_B_block
#15756Reduce
Misc.
Sweeps
The text was updated successfully, but these errors were encountered: