The Matmul operation computes:
$$ C[M, N] = A[M, K] * B[K, N] $$
The last two dimensions of each input tensor are interpreted as the matrix dimensions: M and K for A, and K and N for B. All preceding dimensions are interpreted as batch dimensions.
The operation also supports broadcasting, as described in the cuDNN backend's matmul operation documentation.
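The shape convention above can be illustrated with a small pure-Python reference. This is only a sketch of the semantics, not the cuDNN implementation: `batched_matmul` is a hypothetical helper, it handles a single batch dimension, and the rule that a batch dimension of size 1 is broadcast against the other operand is an assumption here. The cuDNN backend documentation remains the authority on the exact broadcasting rules.

```python
def batched_matmul(a, b):
    """Reference matmul on nested lists shaped [batch, M, K] and [batch, K, N].

    Assumption: a batch dimension of size 1 is broadcast to match the other
    operand. This mirrors common broadcasting rules and is not taken from
    the cuDNN specification.
    """
    batch_a, batch_b = len(a), len(b)
    if batch_a != batch_b and 1 not in (batch_a, batch_b):
        raise ValueError("batch dimensions must match or be 1")

    out = []
    for i in range(max(batch_a, batch_b)):
        # Reuse the single matrix when that operand has batch size 1.
        a_i = a[i if batch_a > 1 else 0]
        b_i = b[i if batch_b > 1 else 0]
        m, k, n = len(a_i), len(b_i), len(b_i[0])
        if any(len(row) != k for row in a_i):
            raise ValueError("inner dimension K of A and B must agree")
        # C[r][c] = sum over p of A[r][p] * B[p][c]
        out.append([[sum(a_i[r][p] * b_i[p][c] for p in range(k))
                     for c in range(n)]
                    for r in range(m)])
    return out


# A has shape [2, 2, 2]; B has shape [1, 2, 1] and is reused for both batches.
A = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
B = [[[1], [1]]]
C = batched_matmul(A, B)  # shape [2, 2, 1]
```

Here each batch entry of `A` is multiplied by the same `B`, so the output keeps the batch size of 2 with M = 2 and N = 1.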
The API for this operation is:
std::shared_ptr<Tensor_attributes>
Matmul(std::shared_ptr<Tensor_attributes> a, std::shared_ptr<Tensor_attributes> b, Matmul_attributes);
`Matmul_attributes` is a lightweight structure with setters:
Matmul_attributes&
set_name(std::string const&)
Matmul_attributes&
set_compute_data_type(DataType_t value)
Python API:
- matmul
    - A
    - B
    - name
    - compute_data_type