BH linear test PCC failure #12539
To run the test, use:
Found that the behaviour is non-deterministic (ND):
vs
|
Can also just run
|
The no bias test fails randomly as well:
|
Disabling the cache makes the PCC worse. [Edit: Either that, or this is card-specific. The run with the cache disabled was on yyzo-bh-04.]
Note that the first parts of the tensors are different while the second parts of the tensors line up. With:
the PCC is
and the tensors look like
|
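For reference, here is a minimal sketch of how the "first part differs, second part lines up" observation could be checked numerically. It assumes PCC means the Pearson correlation coefficient over the flattened tensors; `pcc` and `pcc_by_halves` are hypothetical helpers written for illustration, not the comparison function the test itself uses, and the sample values are made up.

```python
import math


def pcc(expected, actual):
    """Pearson correlation coefficient of two equal-length flat sequences."""
    if len(expected) != len(actual) or len(expected) < 2:
        raise ValueError("need two sequences of the same length (>= 2)")
    n = len(expected)
    mean_e = sum(expected) / n
    mean_a = sum(actual) / n
    cov = sum((e - mean_e) * (a - mean_a) for e, a in zip(expected, actual))
    var_e = sum((e - mean_e) ** 2 for e in expected)
    var_a = sum((a - mean_a) ** 2 for a in actual)
    if var_e == 0 or var_a == 0:
        # Correlation is undefined for constant data (e.g. all-ones tensors).
        # Assumption for this sketch: report 1.0 only on an exact match.
        return 1.0 if list(expected) == list(actual) else 0.0
    return cov / math.sqrt(var_e * var_a)


def pcc_by_halves(expected, actual):
    """Return (PCC of first half, PCC of second half) of the flat data."""
    mid = len(expected) // 2
    return pcc(expected[:mid], actual[:mid]), pcc(expected[mid:], actual[mid:])


# Made-up data: the first half is scrambled, the second half matches.
golden = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
device = [4.0, 1.0, 3.0, 2.0, 5.0, 6.0, 7.0, 8.0]
first_half, second_half = pcc_by_halves(golden, device)
print(f"first half PCC: {first_half:.3f}, second half PCC: {second_half:.3f}")
```

Splitting the comparison like this makes it easier to see which region of the output is corrupted, since a single whole-tensor PCC hides where the mismatch sits.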
When yyzo-bh-04 is in a bad state and in0 and in1 are 32x32 tensors of all 1s, printing CB in0 and CB in1 via
resulted in:
vs on GS
|
Changed the input to bfloat8_b instead of bfloat16 and got weird results:
|
Just tried on yyzo-bh-05, which is in a good state and is producing the correct output. The print_full_tile() method prints the same results there, so it seems print_full_tile() itself does not work the same way on Blackhole. |
This is similar to #12602 except there are more shapes involved. In tests/tt_eager/python_api_testing/sweep_tests/pytests/helper_funcs/test_linear.py for test_linear_with_bias I have the following shapes that I can alternate between:
Just uncomment one and then run
This also looks like there's a tilize call earlier. @nvelickovicTT would you be able to take a look? |
It seems that the problem shapes have N=32. There's no mcast in these kernels (which is where we had issues before, because mcast requires more than one destination), but there may be other assumptions that are violated. |
Looks like it's odd tile counts on the last dimension for tilize, similar to #12602. The following resulted in hanging behaviour:
|
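To make the "odd tile count on the last dimension" condition concrete, here is a small sketch for screening candidate shapes. It assumes 32x32 tiles (matching the 32x32 tensors mentioned earlier); `last_dim_tiles` and `has_odd_last_dim_tiles` are hypothetical helpers, and the shapes in the loop are illustrative rather than the ones from the test file.

```python
TILE_DIM = 32  # assumption: tiles are 32 elements wide


def last_dim_tiles(shape, tile_dim=TILE_DIM):
    """Number of tiles along the last dimension of a tile-aligned shape."""
    last = shape[-1]
    if last % tile_dim != 0:
        raise ValueError(f"last dimension {last} is not a multiple of {tile_dim}")
    return last // tile_dim


def has_odd_last_dim_tiles(shape, tile_dim=TILE_DIM):
    """True when the last dimension spans an odd number of tiles."""
    return last_dim_tiles(shape, tile_dim) % 2 == 1


# Illustrative shapes only; N=32 gives a single (odd) tile on the last dimension.
for shape in ([1, 1, 32, 32], [1, 1, 32, 64], [1, 1, 64, 96]):
    print(shape, last_dim_tiles(shape), has_odd_last_dim_tiles(shape))
```

A helper like this could be used to sort the shapes in the test into suspected-failing and suspected-passing groups before running them on hardware.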
The fix intended for #12602 also makes this test pass. |
tenstorrent/tt-llk-bh#45 has been merged. |
sweep_tests/pytests/helper_funcs/test_linear.py::test_linear_with_bias PCC failure with input shape [1, 1, 64, 128]
See #12349