src: cpu: aarch64: lowp_matmul: Make weights constant
Setting the weights as constant allows us to avoid redundant
pretranspose and reduction operations in Arm Compute Library (ACL)
every time execute is called (they are now run once and cached).
This delivers large speedups, especially for relatively small matmuls.

Note that this is a temporary fix that needs to be handled carefully by
primitive caches in frameworks, since the ACL object now holds more
state - i.e. we want to make sure that the cache maps a layer with a
specific set of weights to the oneDNN primitive storing those weights.

We're currently working on the proper fix for this which involves
making lowp_gemm stateless and fixed-format in ACL and oneDNN.
fadara01 committed Oct 30, 2024
1 parent 0e6591e commit c22f4ae
2 changes: 1 addition & 1 deletion src/cpu/aarch64/matmul/acl_lowp_matmul.cpp
@@ -121,7 +121,7 @@ status_t acl_lowp_matmul_t::pd_t::init(engine_t *engine) {
             = arm_compute::TensorInfo(arm_compute::TensorShape(N(), K()), 1,
                     arm_compute::DataType::QASYMM8_SIGNED,
                     arm_compute::QuantizationInfo(1.0, 0, true));
-    almc_.wei_tensor_info.set_are_values_constant(false);
+    almc_.wei_tensor_info.set_are_values_constant(true);

     almc_.bia_tensor_info = arm_compute::TensorInfo(
             arm_compute::TensorShape(), 1, arm_compute::DataType::F32);
