falcon-7b-instruct failure due to graph changes #701

Open
kevinwuTT opened this issue Jan 7, 2025 · 0 comments

Recently there has been a failure when running falcon-7b-instruct on main: https://github.com/tenstorrent/pytorch2.0_ttnn/actions/runs/12656960036/job/35270930666

It appears that this subgraph containing aten.arange and aten.argmax wasn't there previously (compare the recorded input variations: https://github.com/tenstorrent/pytorch2.0_ttnn/blob/34e84c81d517650dbd259c445957356c83531440/docs/models/Falcon/input_variations.md):

arg0_1: {'val': FakeTensor(..., size=(1, 7), dtype=torch.int64), 'tensor_meta': TensorMetadata(shape=torch.Size([1, 7]), dtype=torch.int64, requires_grad=False, stride=(7, 1), memory_format=torch.contiguous_format, is_quantized=False, qparams={})}

def forward(self, arg0_1):                                                   
  arange = torch.ops.aten.arange.start_step(7, 0, -1, device = device(type='cpu'), pin_memory = False)
  mul = torch.ops.aten.mul.Tensor(arg0_1, arange)
  argmax = torch.ops.aten.argmax.default(mul, 1, True)
  gt = torch.ops.aten.gt.Scalar(argmax, 0)
  return (gt, argmax)
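
For reference, the subgraph corresponds to the following eager PyTorch (a minimal sketch; the random (1, 7) int64 input is a stand-in for arg0_1 above):

import torch

# Stand-in for arg0_1: a (1, 7) int64 tensor, e.g. a batch of token ids.
arg0_1 = torch.randint(0, 100, (1, 7), dtype=torch.int64)

# aten.arange.start_step(7, 0, -1) -> tensor([7, 6, 5, 4, 3, 2, 1])
arange = torch.arange(7, 0, -1)

# Weight each position by the descending ramp, then take the argmax
# along dim 1 with keepdim=True, matching the graph above.
mul = arg0_1 * arange
argmax = torch.argmax(mul, dim=1, keepdim=True)
gt = argmax > 0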

but these ops are still lowered:

def forward(self, arg0_1):
  ttnn_arange = ttnn_decorators_ttnn_arange(7, 0, -1)
  ttnn_from_torch = ttnn_decorators_ttnn_from_torch(arg0_1, device = ttnn_Specified_Device, layout = ttnn_TILE_LAYOUT, dtype = ttnn_bfloat16);  arg0_1 = None
  ttnn_multiply = ttnn_decorators_ttnn_multiply(ttnn_from_torch, ttnn_arange);  ttnn_from_torch = ttnn_arange = None
  ttnn_from_device = ttnn_decorators_ttnn_from_device(ttnn_multiply);  ttnn_multiply = None
  ttnn_to_layout = ttnn_decorators_ttnn_to_layout(ttnn_from_device, ttnn_ROW_MAJOR_LAYOUT);  ttnn_from_device = None
  ttnn_argmax = ttnn_decorators_ttnn_argmax(ttnn_to_layout, dim = 1);  ttnn_to_layout = None
  ttnn_to_torch = ttnn_decorators_ttnn_to_torch(ttnn_argmax, dtype = torch.int64);  ttnn_argmax = None
  gt_scalar = torch.ops.aten.gt.Scalar(ttnn_to_torch, 0)
  return (gt_scalar, ttnn_to_torch)

These input variations currently have issues:
ttnn.arange(7, 0, -1)

RuntimeError: TT_FATAL @ /tmp/build-via-sdist-d26xvola/ttnn-0.54.0rc18+wormhole.b0/ttnn/cpp/ttnn/operations/eltwise/binary/device/binary_device_operation.cpp:229: dim_a == dim_b
info:
Incompatible dimensions 7 and 0  
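
The first failure can likely be reproduced in isolation with a minimal sketch like the one below. The exact ttnn.arange call signature, in particular the explicit device argument, is an assumption; the lowered graph only passes (7, 0, -1) and the lowering decorator may inject the device.

import ttnn

device = ttnn.open_device(device_id=0)
try:
    # Descending range, as emitted by the lowered graph:
    #   ttnn_arange = ttnn_decorators_ttnn_arange(7, 0, -1)
    # Expected: TT_FATAL "Incompatible dimensions 7 and 0" on 0.54.0rc18.
    out = ttnn.arange(7, 0, -1, device=device)
finally:
    ttnn.close_device(device)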

ttnn.argmax(mul, 1, True), where mul has shape torch.Size([1, 7])

RuntimeError: TT_THROW @ /tmp/build-via-sdist-d26xvola/ttnn-0.54.0rc18+wormhole.b0/ttnn/cpp/ttnn/device_operation.hpp:487: tt::exception
info:
Unsupported storage type
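
The second failure should also be reproducible standalone. A minimal sketch that mirrors the from_device/to_layout sequence in the lowered graph (the "Unsupported storage type" error suggests ttnn.argmax is being invoked on a host tensor rather than a device tensor):

import torch
import ttnn

device = ttnn.open_device(device_id=0)
try:
    # Stand-in for the multiply output: a (1, 7) bfloat16 tensor.
    mul = torch.randn(1, 7, dtype=torch.bfloat16)
    t = ttnn.from_torch(mul, device=device, layout=ttnn.TILE_LAYOUT, dtype=ttnn.bfloat16)
    # The lowered graph moves the tensor back to host and converts it to
    # ROW_MAJOR before argmax, so argmax receives a host-storage tensor.
    t = ttnn.from_device(t)
    t = ttnn.to_layout(t, ttnn.ROW_MAJOR_LAYOUT)
    # Expected: TT_THROW "Unsupported storage type" on 0.54.0rc18.
    out = ttnn.argmax(t, dim=1)
finally:
    ttnn.close_device(device)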

I'm not sure why this issue appeared suddenly, since there don't seem to be any direct changes that affect this model. The model and weights haven't changed recently either: https://huggingface.co/tiiuae/falcon-7b-instruct/tree/main.
