Currently, the only unpacker kernel that unpacks 32x32 tiles is llk_unpack_AB_matmul.h. That support was added to keep the compute kernel saturated during matmul operations. Other unpacker kernels could make use of the same addition:
llk_unpack_A.h
llk_unpack_AB.h
llk_unpack_reduce.h
llk_unpack_tilize.h
llk_unpack_untilize.h
Not only would this help performance, it could also remove the otherwise unnecessary face dimension config:
config_unpacker_x_end<UNP_SEL>(face_r_dim);
This call currently has to be added to init functions such as _llk_unpack_A_init_ so that operations which switch between unpacking 16x16 and unpacking 32x32 can be fused.
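To illustrate the reconfiguration cost, here is a rough, self-contained mock rather than real LLK code. Every name in it (`UnpackMode`, `required_x_end`, `count_x_end_reconfigs`) is hypothetical, and it assumes the X-end value is simply `rows * cols - 1` for whatever the unpacker handles in one go. It counts how often the X-end setting would have to be reprogrammed across a fused sequence of ops:

```cpp
#include <cstdint>
#include <vector>

// Hypothetical model, not the real LLK API or register layout.
enum class UnpackMode { Face16x16, Tile32x32 };

// Assumption: X-end is the index of the last datum unpacked per instruction.
constexpr std::uint32_t required_x_end(UnpackMode mode) {
    return mode == UnpackMode::Face16x16 ? (16u * 16u - 1u)   // one 16x16 face
                                         : (32u * 32u - 1u);  // one 32x32 tile
}

// Count how many times X-end must be reprogrammed when the given ops are
// fused back to back. The first op always needs one initial configuration.
inline std::uint32_t count_x_end_reconfigs(const std::vector<UnpackMode>& ops) {
    std::uint32_t reconfigs = 0;
    bool configured = false;
    std::uint32_t current_x_end = 0;
    for (UnpackMode mode : ops) {
        const std::uint32_t wanted = required_x_end(mode);
        if (!configured || wanted != current_x_end) {
            current_x_end = wanted;  // stands in for config_unpacker_x_end<UNP_SEL>(...)
            configured = true;
            ++reconfigs;
        }
    }
    return reconfigs;
}
```

In this model, a fused sequence of matmul (32x32), unpack A (16x16), matmul (32x32) needs three X-end configurations, while the same sequence with every kernel unpacking 32x32 needs only the initial one.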
@ttmtrajkovic @rdjogoTT fyi