Comparing a simple GEMM kernel using block pointers with PyTorch (oneDNN) on PVC 1100:
| Kernel | GB/s | TFLOPS | Triton % of Torch (TFLOPS) |
| --- | --- | --- | --- |
| Torch A x B | 220.7031745 | 136.4162522 | |
| Triton A x B | 174.9541537 | 108.8349622 | 79.78% |
| Torch A x B^T | 203.4142925 | 126.5652739 | |
| Triton A x B^T | 104.0757365 | 64.59003281 | 51.03% |
| Torch A^T x B | 218.4317726 | 135.8922436 | |
| Triton A^T x B | 42.5237584 | 26.39050218 | 19.42% |
| Torch A^T x B^T | 150.3182902 | 93.28844194 | |
| Triton A^T x B^T | 34.43304927 | 21.36935906 | 22.91% |
(captured 2025 Feb 1)
LIBIGC1_VERSION=2.5.11-1077
LEVEL_ZERO_VERSION=1.19.2.0-1076
AGAMA_VERSION=1077
GPU_DEVICE=Intel(R) Data Center GPU Max 1100
TORCH_VERSION=2.7.0
COMPILER_VERSION=2025.0.4
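For reference, the "Triton % Torch tflops" column appears to be the Triton throughput divided by the Torch throughput for the same layout. The sketch below recomputes it from the TFLOPS values reported above; the dictionary names and the `relative_percent` helper are illustrative and not part of the benchmark scripts.

```python
# TFLOPS values copied from the results above, keyed by operand layout.
torch_tflops = {
    "A x B": 136.4162522,
    "A x B^T": 126.5652739,
    "A^T x B": 135.8922436,
    "A^T x B^T": 93.28844194,
}
triton_tflops = {
    "A x B": 108.8349622,
    "A x B^T": 64.59003281,
    "A^T x B": 26.39050218,
    "A^T x B^T": 21.36935906,
}


def relative_percent(triton: float, torch: float) -> float:
    """Return Triton throughput as a percentage of Torch throughput."""
    return 100.0 * triton / torch


# Hypothetical helper structure: one ratio per layout.
ratios = {
    layout: relative_percent(triton_tflops[layout], torch_tflops[layout])
    for layout in torch_tflops
}

for layout, pct in ratios.items():
    print(f"{layout:<10} {pct:6.2f}%")
```

Ranking the layouts by this ratio puts the two cases with a transposed A operand last, at roughly a fifth of the Torch throughput.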
Umbrella issue to track improvements for each of the four individual GEMM kernels above.