Analyze and report accuracy failures in GitHub CI
Currently, the standalone scripts will run and fail silently. The logs will show which models fail and where, but this is not easy to view.
Possibly dump a JSON file when a comparison fails and have the CI process it.
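A minimal sketch of the JSON idea, using only the standard library. The function names, the report directory layout, and the fields (model, op, metric, value, threshold) are all hypothetical and not taken from the existing scripts; the intent is only to show one report file per failed comparison plus a collector the CI step could call.

```python
import json
from pathlib import Path


def record_failure(report_dir, model, op, metric, value, threshold):
    """Write one JSON file describing a failed accuracy comparison.

    All field names here are illustrative assumptions.
    """
    report_dir = Path(report_dir)
    report_dir.mkdir(parents=True, exist_ok=True)
    entry = {
        "model": model,
        "op": op,
        "metric": metric,
        "value": value,
        "threshold": threshold,
    }
    # One file per (model, op) pair; names are assumed to be filename-safe.
    path = report_dir / f"{model}__{op}.json"
    path.write_text(json.dumps(entry, indent=2))
    return path


def summarize_failures(report_dir):
    """Load every report in report_dir and print a one-line summary for each."""
    report_dir = Path(report_dir)
    if not report_dir.is_dir():
        return []
    failures = [json.loads(p.read_text()) for p in sorted(report_dir.glob("*.json"))]
    for f in failures:
        print(
            f"FAIL {f['model']} {f['op']}: "
            f"{f['metric']}={f['value']} (threshold {f['threshold']})"
        )
    return failures
```

A CI step could call summarize_failures on the report directory and exit with a nonzero status when the returned list is non-empty, so failures show up in the job result instead of only in the logs.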
Organize into a base module that only exports standalone runnable code.
Separate modules to export additional code:
Split the forward functions into separate definitions that mimic the original test model more closely.
Fusing all of the forward functions is currently simpler because the input data applies only to the first forward function. However, the fused version may not be representative of the original model, which has separate graph breaks.
Using safetensors instead of pickle
There are issues with saving the tensors in this format because of shared memory. save_model won't work because what is being saved is not a torch.nn.Module.
RuntimeError:
Some tensors share memory, this will lead to duplicate memory on disk and potential differences when loading them again: [{'arg0_1', 'arg27_1'}].
A potential way to correctly save your model is to use `save_model`.
More information at https://huggingface.co/docs/safetensors/torch_shared_tensors
inspect.getsource() assumes the target function is always accessible and valid. This may fail if node.target is dynamically created or a non-source-mapped function.
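A defensive wrapper is one way to handle this. The sketch below assumes that emitting a placeholder comment is an acceptable fallback; the helper name and the placeholder text are hypothetical. inspect.getsource() raises OSError when the source cannot be retrieved and TypeError when the object is a built-in or not a source-bearing object, so both are caught.

```python
import inspect


def get_source_or_placeholder(target):
    """Return the source of target, or a comment placeholder if unavailable."""
    try:
        return inspect.getsource(target)
    except (OSError, TypeError):
        # Built-ins, dynamically created functions, and non-callables land here.
        name = getattr(target, "__qualname__", repr(target))
        return f"# source unavailable for {name}\n"
```

For example, passing the built-in len yields the placeholder line rather than an exception, so code generation can continue and the gap stays visible in the output.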
Original PR: #611
Separate modules to export additional code: https://github.com/tenstorrent/pytorch2.0_ttnn/tree/kw/gen_acc_tests_for_profiling