BH linear test PCC failure #12539

Closed
bbradelTT opened this issue Sep 11, 2024 · 12 comments

@bbradelTT
Contributor

sweep_tests/pytests/helper_funcs/test_linear.py::test_linear_with_bias PCC failure with input shape [1, 1, 64, 128]

See #12349
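
For context, the PCC number in these failures is a Pearson correlation coefficient between the PyTorch golden output and the device output. A minimal sketch of that kind of comparison (the helper name compute_pcc and the 0.99 threshold are illustrative assumptions, not the repo's exact comparison function):

import torch

def compute_pcc(golden: torch.Tensor, calculated: torch.Tensor) -> float:
    # Flatten both tensors and compute the Pearson correlation coefficient.
    stacked = torch.stack([golden.flatten().to(torch.float32),
                           calculated.flatten().to(torch.float32)])
    return torch.corrcoef(stacked)[0, 1].item()

# Hypothetical usage: the test asserts the PCC stays above a threshold such as 0.99.
# pcc = compute_pcc(pytorch_out, ttlib_out)
# assert pcc >= 0.99, f"PCC check failed: {pcc}"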

@bbradelTT
Contributor Author

To run the test, use:

pytest "tests/tt_eager/python_api_testing/sweep_tests/pytests/helper_funcs/test_linear.py::test_linear_with_bias"

Found that the behaviour is non-deterministic (ND):

FAILED tests/tt_eager/python_api_testing/sweep_tests/pytests/helper_funcs/test_linear.py::test_linear_with_bias[input_shapes1] - AssertionError: linear test failed with input shape [[1, 1, 64, 128]]. Max ATOL Delta: 318.1839599609375, Max RTOL Delta:...

vs

PASSED tests/tt_eager/python_api_testing/sweep_tests/pytests/helper_funcs/test_linear.py::test_linear_with_bias[input_shapes1]
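
Because the behaviour is non-deterministic, it can take several invocations to hit the failure. A small sketch to loop the failing parametrization from Python (just pytest.main in a loop; a shell for-loop over the pytest command above works equally well):

import pytest

# Re-run the flaky parametrization several times to catch the intermittent PCC failure.
test = ("tests/tt_eager/python_api_testing/sweep_tests/pytests/"
        "helper_funcs/test_linear.py::test_linear_with_bias[input_shapes1]")
for i in range(10):
    print(f"--- run {i} ---")
    pytest.main([test])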

@bbradelTT
Contributor Author

You can also run just the failing parametrization:

pytest "tests/tt_eager/python_api_testing/sweep_tests/pytests/helper_funcs/test_linear.py::test_linear_with_bias[input_shapes1]"

@bbradelTT
Contributor Author

The no-bias test fails randomly as well:

FAILED tests/tt_eager/python_api_testing/sweep_tests/pytests/helper_funcs/test_linear.py::test_linear_no_bias[input_shapes1] - AssertionError: linear test failed with input shape [[1, 1, 64, 128]]. Max ATOL Delta: 298.13507080078125, Max RTOL Delta...

@bbradelTT
Contributor Author

bbradelTT commented Oct 3, 2024

Disabling the cache makes the PCC worse. [Edit: Either that, or this is card-specific. The cache was disabled on yyzo-bh-04.]
By default:

Max ATOL Delta: 286.7896423339844, Max RTOL Delta: 347.5240783691406, PCC: 0.5228953954116297, PCC check failed pytorch tensor([[[[  30.1292,   -8.1660,   -9.1197,  ...,  -42.8547,  -26.1633,
            170.2308],  
          [ -16.0048,   28.4046,   63.4314,  ...,   55.6276,  -75.5297,
             -5.3420],  
          [-125.9157,   10.2390,   35.6248,  ...,    1.4868,   47.4886,
            -28.7936],
          ...,
          [   2.8423,   29.4760,  159.3267,  ...,  105.7456,   94.1220,
             -3.5739],  
          [ 144.4431,   76.5388,  -39.9138,  ...,   41.7540,   13.7142,
            -22.6677],  
          [ -16.5986,   -9.1731,  -41.9026,  ...,   -1.9923,   -4.9434,
             75.5792]]]]) ttlib tensor([[[[ 136.0000,   54.2500,    2.9844,  ...,  -50.7500,  -47.5000,
            -43.7500],  
          [   2.3750,   66.5000,    7.5312,  ...,  -63.7500, -162.0000,
            -96.5000],  
          [  14.8750,   -9.6875,  -43.5000,  ...,   45.2500,   33.7500,
              8.3125],
          ...,
          [   2.9688,   29.6250,  159.0000,  ...,  105.5000,   95.0000,
             -3.7656],
          [ 145.0000,   76.0000,  -39.5000,  ...,   41.5000,   13.7500,
            -23.2500],
          [ -16.3750,   -9.3125,  -41.7500,  ...,   -1.9375,   -5.0938,
             75.5000]]]], dtype=torch.bfloat16)

Note that the first parts of the tensors are different while the second parts of the tensors line up.

With:

export TT_METAL_DISABLE_L1_DATA_CACHE_RISCVS="BR,NC,TR" 

the PCC is

Max ATOL Delta: 3610.570068359375, Max RTOL Delta: 3264.5302734375, PCC: 0.03516976696369031

and the tensors look like

pytorch tensor([[[[  30.1292,   -8.1660,   -9.1197,  ...,  -42.8547,  -26.1633,
            170.2308],
          [ -16.0048,   28.4046,   63.4314,  ...,   55.6276,  -75.5297,
             -5.3420],
          [-125.9157,   10.2390,   35.6248,  ...,    1.4868,   47.4886,
            -28.7936],
          ...,
          [   2.8423,   29.4760,  159.3267,  ...,  105.7456,   94.1220,
             -3.5739],
          [ 144.4431,   76.5388,  -39.9138,  ...,   41.7540,   13.7142,
            -22.6677],
          [ -16.5986,   -9.1731,  -41.9026,  ...,   -1.9923,   -4.9434,
             75.5792]]]]) ttlib tensor([[[[ 1.0078e+00,  7.9102e-02, -7.8125e-01,  ..., -8.6800e+02,
           -9.1600e+02, -9.7200e+02],
          [ 1.0078e+00,  7.9102e-02, -7.8125e-01,  ...,  3.6000e+02,
            1.6100e+02,  2.8800e+02],
          [ 1.0078e+00,  7.9102e-02, -7.8125e-01,  ..., -1.1680e+03,
           -1.1840e+03, -1.1280e+03],
          ...,
          [ 1.2560e+03,  1.3760e+03,  1.5360e+03,  ...,  1.1500e+02,
            9.7500e+01, -3.7750e+01],
          [ 8.0400e+02,  7.6000e+02,  7.2000e+02,  ...,  5.0750e+01,
            8.7000e+01, -4.9688e+00],
          [-4.3750e+01, -3.7500e+01, -3.4250e+01,  ...,  3.7500e+01,
            4.2500e+01,  8.1500e+01]]]], dtype=torch.bfloat16)


Note that none of the parts of the tensors line up.
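
To make the "first part differs, tail lines up" observation quantitative, the PCC can be computed on row slices separately. A self-contained sketch with synthetic tensors (the real inputs would be the pytorch/ttlib outputs dumped above):

import torch

def compute_pcc(golden, calculated):
    # Pearson correlation over the flattened tensors, as in the earlier sketch.
    stacked = torch.stack([golden.flatten().to(torch.float32),
                           calculated.flatten().to(torch.float32)])
    return torch.corrcoef(stacked)[0, 1].item()

# Synthetic stand-in for the default run: the first 32 rows are corrupted, the rest match.
golden = torch.randn(1, 1, 64, 128)
device_out = golden.clone()
device_out[..., :32, :] = torch.randn(1, 1, 32, 128)

print(compute_pcc(golden[..., :32, :], device_out[..., :32, :]))  # near 0 for the head
print(compute_pcc(golden[..., 32:, :], device_out[..., 32:, :]))  # 1.0 for the tail
print(compute_pcc(golden, device_out))  # roughly 0.5 overall, like the default run above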

@bbradelTT
Contributor Author

bbradelTT commented Oct 3, 2024

When yyzo-bh-04 is in a bad state and in0 and in1 are 32x32 tensors of all 1s, I found that printing CB in0 and CB in1 via

// Dump every row of one tile from a circular buffer via the unpacker's debug print.
inline void print_full_tile(uint32_t cb_id, uint32_t tile_id = 0, bool untilize = false) {
    UNPACK(( DPRINT << "======" << ENDL() ));
    for (uint16_t r = 0; r < 32; ++r) {
        // Slice out a single row r across the full 32-element width of the tile.
        SliceRange sr = SliceRange{.h0 = r, .h1 = (uint16_t)(r + 1), .hs = 1, .w0 = 0, .w1 = 32, .ws = 1};
        UNPACK(( DPRINT << (uint)r << TileSlice(cb_id, tile_id, sr, true, untilize) << ENDL() ));
    }
    UNPACK(( DPRINT << "++++++" << ENDL() ));
}

resulted in:

======
00 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1

10 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1

20 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1

30 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1

40 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1

50 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1

60 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1

70 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1

80 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1

90 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1

100 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1

110 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1

120 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1

130 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1

140 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1

150 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1

160 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1

170 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1

180 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1

190 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1

200 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1

210 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1

220 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1

230 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1

240 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1

250 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1

260 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1

270 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1

280 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1

290 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1

300 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1

310 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1

++++++

vs on GS (Grayskull):

======
01 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

11 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

21 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

31 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

41 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

51 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

61 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

71 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

81 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

91 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

101 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

111 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

121 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

131 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

141 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

151 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

161 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

171 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

181 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

191 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

201 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

211 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

221 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

231 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

241 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

251 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

261 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

271 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

281 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

291 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

301 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

311 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

++++++

cc @ncvetkovicTT @nvelickovicTT @rtawfik01

@bbradelTT
Contributor Author

Changed the input to bfloat8_b instead of bfloat16 and got weird results:

======
03.38953e+38 3.38953e+38 3.38953e+38 3.38953e+38 3.38953e+38 3.38953e+38 3.38953e+38 3.38953e+38 3.38953e+38 3.38953e+38 3.38953e+38 3.38953e+38 3.38953e+38 3.38953e+38 3.38953e+38 3.38953e+38 3.38953e+38 3.38953e+38 3.38953e+38 3.38953e+38 3.38953e+38 3.38953e+38 3.38953e+38 3.38953e+38 3.38953e+38 3.38953e+38 3.38953e+38 3.38953e+38 3.38953e+38 3.38953e+38 3.38953e+38 3.38953e+38

13 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3

23 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3

33 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3

43 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3

53 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3

63 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3

73 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3

83 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3

93 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3

103 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3

113 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3

123 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3

133 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3

143 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3

153 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3

163 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3

172.36936e-38 9.7713e-38 1.92487e-37 2.87996e-37 1.96895e-37 2.47956e-38 1.93957e-37 1.21223e-38 1.46202e-37 5.25299e-38 3.0122e-37 1.26733e-38 1.89181e-38 2.98282e-37 4.77545e-38 4.84891e-38 1.41794e-37 3.58158e-38 2.89465e-37 3.72852e-38 2.85057e-37 1.2857e-38 3.78362e-38 1.77243e-38 2.40609e-38 3.54485e-38 4.88565e-38 2.58976e-38 3.61832e-38 5.21626e-38 1.49876e-37 1.02121e-37

182.04242e-37 1.78161e-38 7.45704e-38 2.07181e-37 3.71015e-38 2.57139e-38 7.16317e-38 1.18468e-38 7.31011e-38 4.92238e-38 1.23978e-38 2.96812e-37 5.06932e-38 1.95426e-37 3.59995e-38 1.86426e-38 1.45467e-37 1.98365e-37 1.88263e-38 9.91823e-38 4.81218e-38 1.01386e-37 1.22141e-38 2.44282e-38 1.29488e-38 1.48406e-37 1.5061e-37 1.42529e-37 2.42446e-38 2.05712e-37 4.95912e-38 1.2306e-38

191.79998e-38 1.91018e-37 9.47742e-38 9.84477e-38 7.0897e-38 2.92404e-37 2.55303e-38 1.02856e-37 7.23664e-38 1.0359e-37 1.43263e-37 1.87344e-38 1.85508e-38 9.55089e-38 9.62436e-38 1.04325e-37 2.38772e-38 7.60398e-38 1.43998e-37 1.0506e-37 1.20304e-38 7.1999e-38 1.80916e-38 2.46119e-38 1.27651e-38 3.56322e-38 5.14279e-38 2.02773e-37 1.44733e-37 2.83588e-37 2.90935e-37 1.19386e-38

203.0269e-37 7.27337e-38 1.49141e-37 1.89548e-37 1.18468e-38 3.67342e-40 1.01386e-37 3.80199e-38 4.81218e-38 2.40609e-38 5.06932e-38 2.60813e-38 9.18355e-41 1.42529e-37 1.19386e-38 9.47742e-38 4.73871e-38 5.03259e-38 1.41794e-37 7.12643e-38 2.46119e-38 1.87344e-38 5.96931e-39 2.42446e-38 1.19386e-39 7.27337e-38 1.00652e-37 7.0897e-38 3.61832e-38 3.58158e-38 6.06114e-39 2.38772e-38

214.84891e-38 1.0359e-37 7.60398e-38 1.86426e-38 7.07133e-39 1.10203e-39 6.79583e-39 7.38357e-38 6.9795e-39 1.49876e-37 1.25815e-38 3.54485e-38 2.75506e-40 1.45467e-37 9.55089e-38 4.59177e-40 3.78362e-38 3.74689e-38 1.31325e-38 5.17952e-38 1.01019e-39 4.88565e-38 1.2857e-39 1.43998e-37 4.77545e-38 1.0506e-37 7.53051e-38 5.25299e-38 3.72852e-38 7.16317e-38 5.10605e-38 3.56322e-38

221.21223e-38 1.80916e-38 6.33665e-39 1.04325e-37 2.44282e-38 6.15298e-39 4.92238e-38 7.1999e-38 2.58976e-38 1.5208e-37 2.51629e-38 3.71015e-38 1.47671e-37 5.51013e-40 3.63669e-38 6.88766e-39 5.14279e-38 9.7713e-38 1.26733e-38 6.70399e-39 9.62436e-38 1.84589e-38 3.76526e-38 1.81834e-38 1.49141e-37 7.45704e-38 1.43263e-37 1.2306e-38 1.20304e-38 1.30406e-38 7.49378e-38 6.24481e-39

233.69179e-38 1.48406e-37 1.02856e-37 7.16317e-39 1.90099e-38 1.85508e-38 2.6265e-38 8.26519e-40 7.23664e-38 1.79998e-38 2.55303e-38 1.79079e-38 1.29488e-38 9.69783e-38 9.84477e-38 1.89181e-38 1.88263e-38 1.78161e-38 7.56724e-38 2.36936e-38 1.2857e-38 1.77243e-38 1.5061e-37 2.57139e-38 1.02121e-37 7.42031e-38 1.44733e-37 5.21626e-38 3.59995e-38 6.42848e-39 1.27651e-38 1.37753e-39

249.18355e-40 1.51345e-37 1.83671e-40 2.53466e-38 1.22141e-38 7.255e-39 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

250 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

@bbradelTT
Contributor Author

Just tried on yyzo-bh-05, which is in a good state and is producing the correct output.

The print_full_tile() method prints the same results there, so it seems the printing itself does not work the same way on Blackhole.

@bbradelTT
Contributor Author

This is similar to #12602 except there are more shapes involved.

In tests/tt_eager/python_api_testing/sweep_tests/pytests/helper_funcs/test_linear.py for test_linear_with_bias I have the following shapes that I can alternate between:

    "input_shapes",
    (
        #[[1, 1, 32, 64], [1, 1, 256, 64], [1, 1, 1, 256]], # Passes
        #[[1, 1, 64, 128], [1, 1, 32, 128], [1, 1, 1, 32]], # Passes or bad PCC
        #[[1, 1, 32, 32], [1, 1, 32, 32], [1, 1, 1, 32]], # Passes or bad PCC or long wait (hang) behaviour
    ),

Just uncomment one and then run

pytest "tests/tt_eager/python_api_testing/sweep_tests/pytests/helper_funcs/test_linear.py::test_linear_with_bias"

It also looks like there's a tilize call earlier in the flow.

@nvelickovicTT would you be able to take a look?
Probably start with [[1, 1, 64, 128], [1, 1, 32, 128], [1, 1, 1, 32]] since the turnaround on bad PCC would probably be faster.
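
For reference, a torch-only golden computation for that suggested shape combination, assuming the second tensor is the weight in [..., N, K] layout and is transposed inside the linear op (an assumption about the sweep test's convention, not taken from the repo):

import torch

# [[1, 1, 64, 128], [1, 1, 32, 128], [1, 1, 1, 32]]: activation, weight, bias
x = torch.randn(1, 1, 64, 128)
w = torch.randn(1, 1, 32, 128)
b = torch.randn(1, 1, 1, 32)

golden = x @ w.transpose(-2, -1) + b   # output shape [1, 1, 64, 32], i.e. N = 32
print(golden.shape)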

@bbradelTT
Contributor Author

It seems that the problem shapes have N=32. Even though there's no mcast in these kernels (mcast is where we had issues before, because it requires more than one destination), there may be other assumptions that are being violated.

@bbradelTT
Contributor Author

Looks like the trigger is an odd tile count on the last dimension for tilize, similar to #12602 (see the quick check below).

The following resulted in hanging behaviour:

[[1, 1, 64, 128], [1, 1, 96, 128], [1, 1, 1, 96]]
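
A quick check of how many 32-wide tiles each case has on the last (N) dimension, to illustrate the odd-tile-count pattern (the mapping of shapes to pass/fail/hang is taken from the comments above):

# Number of 32-wide tiles on the last dimension (N) for each case discussed above.
cases = {
    "N = 256 (passes)": 256,
    "N = 32 (bad PCC / hang)": 32,
    "N = 96 (hangs)": 96,
}
for name, n in cases.items():
    tiles = n // 32
    parity = "odd" if tiles % 2 else "even"
    print(f"{name}: {tiles} tile(s), {parity}")  # 256 -> 8 (even); 32 -> 1 (odd); 96 -> 3 (odd)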

@bbradelTT
Contributor Author

The same fix that should fix #12602 causes the test to pass.

@bbradelTT
Contributor Author

tenstorrent/tt-llk-bh#45 has been merged.
Closing.
