
Change decomposition of zeros_like op #58

Open · wants to merge 1 commit into base: main
Conversation

Contributor

@meenakshiramanathan1 commented Jan 9, 2025

The model hits AttributeError: <class 'tvm.ir.op.Op'> has no attribute name_hint: the zeros op is bound as a global_var and passed as an input param to tvmgen_default_forge_main, as shown in the log below.
detr.log

The zeros_like operation in TVM was originally implemented by mapping it directly to the corresponding tvm.zeros_like function, while the new_zeros and zeros operations were mapped to the full operation, as mentioned here. This inconsistency caused the zeros operation to remain unbound during the partition_graph transformation in TVM.

To address this, the decomposition of the zeros_like operation was changed to map to the zeros op, so that zeros_like ultimately lowers to the full operation like the other two. This keeps the three ops consistent and resolves the issue of the zeros operation being left unbound during the transformation.
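The change described above can be sketched with a toy model of the decompositions. This is illustrative only, not the actual TVM frontend code: the real converters live in python/tvm/relay/frontend/pytorch.py, and the dict-based "ops" here are hypothetical stand-ins for Relay expressions.

```python
# Toy model of the decomposition change (hypothetical structures; the
# real converters are in TVM's PyTorch frontend).

def full(fill_value, shape, dtype):
    # Terminal op: a constant-filled tensor of a given shape and dtype.
    return {"op": "full", "fill": fill_value, "shape": shape, "dtype": dtype}

def zeros(shape, dtype):
    # zeros already decomposes to full(0, ...), as the PR description notes.
    return full(0, shape, dtype)

def zeros_like_old(data):
    # Old decomposition: a dedicated zeros_like op, which SimplifyExpr
    # later rewrites on its own, leaving a bare zeros call behind.
    return {"op": "zeros_like", "ref": data}

def zeros_like_new(data):
    # New decomposition: reuse zeros directly, so zeros_like bottoms out
    # in full exactly like new_zeros and zeros do.
    return zeros(data["shape"], data["dtype"])
```

With this change, all three ops share the same terminal form, so no bare zeros call survives to the partitioning stage.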

@meenakshiramanathan1 marked this pull request as draft January 9, 2025 04:46
@meenakshiramanathan1 force-pushed the mramanathan/detr_tvm branch 2 times, most recently from 2ae28ab to d426723 January 16, 2025 06:53
@meenakshiramanathan1 changed the title Deprecate enable_tvm_constant_prop flag → Change decomposition of zeros_like op Jan 16, 2025
@meenakshiramanathan1 marked this pull request as ready for review January 17, 2025 03:37
python/tvm/relay/frontend/pytorch.py — two outdated review threads (resolved)
@meenakshiramanathan1 force-pushed the mramanathan/detr_tvm branch 5 times, most recently from 3689b0a to faa7c99 January 20, 2025 12:09
@vkovinicTT
Member

Can you write tests for these 3 OPs?

@vkovinicTT
Member

LGTM

@vkovinicTT
Member

vkovinicTT commented Jan 22, 2025

Sorry, I missed one thing :) We want to make as few changes in the TVM part as possible, since it is a third-party library, so it would be nice to track down where the issue arises and try to fix it in forge-fe. For starters, can you add a zeros_like test to forge-fe that recreates this issue (<class 'tvm.ir.op.Op'> has no attribute name_hint), so we can debug it more easily and guard against regressions? If you have any questions or suggestions on how to fix it, just let me know.

@meenakshiramanathan1
Contributor Author

> Sorry, I missed one thing :) We want to make as few changes in the TVM part as possible since it is a third party library. So it would be nice to track down where the issue arises and try to fix it on forge-fe. Can you for starters add the zeros_like test to forge-fe which recreates this issue with <class 'tvm.ir.op.Op'> has no attribute name_hint, so we could debug it more easily and also for future robustness. If you have any questions or suggestions on how to fix it just let me know.

Sure, I will add a zeros_like test and look into this further.

@meenakshiramanathan1
Contributor Author

meenakshiramanathan1 commented Jan 28, 2025

@nvukobratTT @vkovinicTT
Findings regarding this issue:

  1. The core of the issue is the SimplifyExpr pass present in run_relay_passes here: after this pass, the zeros_like op is converted to a zeros op, as mentioned here. As a result, only the shape is sent as input to this op, which becomes a problem in the later stages described below.

  2. Later, in the transform.AnnotateTarget pass in partition_for_forge, compiler_begin and compiler_end wrappers are added around all ops, but this is skipped for the zeros op since its input is a shape and not a tensor.

  3. Ref - This part treats zeros operations as input variables when they appear as call nodes with no arguments, because the code specifically checks whether a call node has no arguments (i.e., call->args.size() == 0). For functions like zeros(), only a shape is passed as input, so the code treats them as special tensor operations and adds them directly to the compiler_ends list without wrapping them in compiler_begin and compiler_end. The zeros ops are therefore left unwrapped and considered input vars.

  4. In the PartitionGraph pass, which runs a few steps after AnnotateTarget, only ops wrapped between compiler_begin and compiler_end are brought into tvmgen_default_forge_main_0. As a result, these ops stay unbound after the pass and cause this issue.

  5. One possible workaround is disabling the SimplifyExpr pass so that the op remains zeros_like, since support for it is present in ttnn (ttnn.zeros_like - https://docs.tenstorrent.com/tt-metal/latest/ttnn/ttnn/api/ttnn.zeros_like.html) but not yet enabled in ttir; if support is added there, we could map it in the lowering part. Another workaround would be to change, more specifically, how the zeros op is handled in the SimplifyExpr pass here. Please let me know your thoughts on this.
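The chain in points 2-4 can be reconstructed as a small toy model. Everything here is hypothetical (the real logic lives in TVM's C++ AnnotateTarget and PartitionGraph passes); it only illustrates why an argument-free call like zeros, whose sole "input" is a shape attribute, ends up outside the partitioned function.

```python
# Toy reconstruction of the annotate/partition behaviour described above
# (hypothetical classes and functions, not TVM's actual API).

class Call:
    def __init__(self, op, args):
        self.op = op      # operator name, e.g. "zeros" or "add"
        self.args = args  # tensor arguments only; shapes live in attrs

def annotate(call):
    # Mirrors the call->args.size() == 0 special case: a call with no
    # tensor arguments is treated like an input variable and is NOT
    # wrapped in compiler_begin/compiler_end.
    if len(call.args) == 0:
        return [call.op]  # left unwrapped, treated as an input var
    return ["compiler_begin", call.op, "compiler_end"]

def partition(annotated):
    # Only ops wrapped in compiler_begin/compiler_end get pulled into
    # the partitioned function (tvmgen_default_forge_main_0 in the log);
    # everything else stays unbound after the pass.
    return "bound" if annotated[0] == "compiler_begin" else "unbound"
```

In this model, partition(annotate(Call("add", ["x", "y"]))) lands inside the partitioned function, while partition(annotate(Call("zeros", []))) stays unbound, matching the failure described in point 4.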
