Unpack 32x32 tiles #6

Open
5 tasks
rtawfik01 opened this issue May 28, 2024 · 0 comments
Labels
Feature (new feature request), Performance (feature that helps with performance, not a blocker for functionality)

Comments

@rtawfik01
Collaborator

Currently, llk_unpack_AB_matmul.h is the only unpacker kernel that unpacks 32x32 tiles. That support was added to saturate the compute kernel for matmul operations. Other unpacker kernels could make use of the same additions:

  • llk_unpack_A.h
  • llk_unpack_AB.h
  • llk_unpack_reduce.h
  • llk_unpack_tilize.h
  • llk_unpack_untilize.h

Besides helping performance, this could also remove the otherwise unnecessary face dimension configs:

    config_unpacker_x_end<UNP_SEL>(face_r_dim);

These currently have to be added to init functions such as _llk_unpack_A_init_ so that operations which switch between unpacking 16x16 and unpacking 32x32 can be fused.
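To make the cost concrete, here is a minimal, self-contained sketch of the idea. It is not the real LLK code: `MockUnpacker`, `mock_init`, `count_x_end_writes`, and the way x-end is computed are hypothetical stand-ins, and the sketch assumes a 32x32 tile is unpacked either as four 16x16 pieces or in one go. It only models why mixed modes force every init to reprogram x-end, while a single shared mode would not.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

constexpr std::uint32_t kFaceDim = 16;  // assumed 16x16 unpack granularity
constexpr std::uint32_t kTileDim = 32;  // full 32x32 tile

enum class UnpackMode { Face16x16, Tile32x32 };

// Hypothetical stand-in for unpacker state; not the real hardware registers.
struct MockUnpacker {
    std::uint32_t x_end = 0;         // last datum index covered by one unpack
    std::uint32_t x_end_writes = 0;  // how often x-end was reprogrammed
};

// Datums moved by a single unpack in the given mode.
constexpr std::uint32_t datums_per_unpack(UnpackMode m) {
    return m == UnpackMode::Face16x16 ? kFaceDim * kFaceDim
                                      : kTileDim * kTileDim;
}

// Unpacks needed to bring in one full 32x32 tile.
constexpr std::uint32_t unpacks_per_tile(UnpackMode m) {
    return (kTileDim * kTileDim) / datums_per_unpack(m);
}

// Mock init step: reprograms x-end for the requested mode (assumed behaviour).
void mock_init(MockUnpacker& u, UnpackMode m) {
    u.x_end = datums_per_unpack(m) - 1;
    ++u.x_end_writes;
}

// Counts x-end writes for a fused chain of kernels. If modes can differ,
// every init must reconfigure; if all kernels share one mode, a single
// up-front configuration is enough.
std::uint32_t count_x_end_writes(const std::vector<UnpackMode>& kernels,
                                 bool reconfigure_in_every_init) {
    MockUnpacker u;
    if (!reconfigure_in_every_init && !kernels.empty()) {
        mock_init(u, kernels.front());
    }
    for (UnpackMode m : kernels) {
        if (reconfigure_in_every_init) {
            mock_init(u, m);
        }
    }
    return u.x_end_writes;
}

// Usage: a mixed chain versus a chain where every kernel unpacks 32x32.
inline void example_usage() {
    const std::vector<UnpackMode> mixed = {
        UnpackMode::Tile32x32, UnpackMode::Face16x16, UnpackMode::Tile32x32};
    const std::vector<UnpackMode> uniform(3, UnpackMode::Tile32x32);

    assert(count_x_end_writes(mixed, true) == 3);     // one write per init
    assert(count_x_end_writes(uniform, false) == 1);  // configured once
}
```

In this model, the 16x16 mode also needs four unpacks per tile where the 32x32 mode needs one, which is the performance side of the request.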

@ttmtrajkovic @rdjogoTT fyi

@rtawfik01 rtawfik01 added Performance Feature that helps with performance, not a blocker for functionality Feature New feature request labels May 28, 2024