ZenDNN Release v4.2
The highlights of this release are as follows:
- The ZenDNN library is based on oneDNN v2.6.3 and provides optimizations tailored to enable performant AI inference on AMD EPYC™ servers.
- The ZenDNN library can be used in the following frameworks through a plug-in (see the sketch after this list):
  - TensorFlow v2.16 and later
  - PyTorch v2.0 and later
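A minimal sketch of routing PyTorch inference through the plug-in follows; the package and torch.compile backend name zentorch are assumptions here, so consult the plug-in documentation for the exact names in your release.

```python
# Sketch (assumed names): inference through the ZenDNN PyTorch plug-in.
# Importing the plug-in is assumed to register a "zentorch" backend
# for torch.compile.
import torch
import zentorch  # noqa: F401  (the import registers the backend)

model = torch.nn.Linear(512, 512).eval()
compiled = torch.compile(model, backend="zentorch")

with torch.no_grad():
    out = compiled(torch.randn(8, 512))
```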
- The ZenDNN library is integrated with ONNX Runtime v1.17.0.
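As a hedged illustration of the ONNX Runtime integration, the sketch below selects a ZenDNN execution provider by name; the provider string, model file, and input shape are assumptions, and a ZenDNN-enabled build of ONNX Runtime v1.17.0 is required.

```python
# Sketch (assumed names): running an ONNX model through a ZenDNN-enabled
# build of ONNX Runtime. "model.onnx" and the input shape are placeholders.
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession(
    "model.onnx",
    providers=["ZendnnExecutionProvider", "CPUExecutionProvider"],
)
input_name = session.get_inputs()[0].name
dummy = np.random.rand(1, 3, 224, 224).astype(np.float32)
outputs = session.run(None, {input_name: dummy})
```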
- Supports Environment Variables for Tuning Performance
  The following environment variables have been added to tune performance (see the sketch after this list):
  - Memory Pooling (Persistent Memory Caching)
    - ZENDNN_ENABLE_MEMPOOL for all TensorFlow models
    - Added MEMPOOL support for BF16 models in TensorFlow
  - Convolution Operation
    - ZENDNN_CONV_ALGO for all TensorFlow models
    - Added new options to the ALGO paths
  - Matrix Multiplication Operation
    - ZENDNN_MATMUL_ALGO for TensorFlow, PyTorch, and ONNX Runtime models
    - Added new options, new ALGO paths, and an experimental auto-tuner for TensorFlow
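These variables are read from the environment, so they must be set before the framework loads ZenDNN. A minimal sketch follows; the values shown are illustrative assumptions, and the supported options for each variable are documented in the ZenDNN user guide.

```python
# Sketch: set ZenDNN tuning variables before importing the framework.
# The values are illustrative; see the user guide for supported options.
import os

os.environ["ZENDNN_ENABLE_MEMPOOL"] = "1"    # enable persistent memory caching
os.environ["ZENDNN_CONV_ALGO"] = "1"         # pick a convolution ALGO path
os.environ["ZENDNN_MATMUL_ALGO"] = "FP32:1"  # pick a MatMul ALGO path

import tensorflow as tf  # noqa: E402  (import only after the variables are set)
```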
- Embedding Bag and Embedding Operators
  - Support for the Embedding operator
  - AVX512 support for the Embedding and Embedding Bag kernels
  - Two new parallelization strategies for the Embedding and Embedding Bag operators: table threading and hierarchical threading
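For context, the sketch below shows the multi-table EmbeddingBag workload these strategies target, in plain PyTorch; the reading that table threading parallelizes across tables while hierarchical threading also splits work within a table is an interpretation, not a statement from these notes.

```python
# Sketch of a multi-table EmbeddingBag workload (plain PyTorch). Independent
# tables are a natural unit of parallelism for table threading; the lookups
# within a single table are the finer-grained work a hierarchical strategy
# can additionally split (interpretation assumed).
import torch

tables = [torch.nn.EmbeddingBag(10_000, 64, mode="sum") for _ in range(8)]

indices = torch.randint(0, 10_000, (256,))
offsets = torch.arange(0, 256, 4)  # 64 bags of 4 lookups each

pooled = [table(indices, offsets) for table in tables]  # one lookup per table
```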
- Matrix Multiplication (MatMul) Operators
  - MatMul post-ops computation with BLIS kernels
  - Weight caching for FP32 JIT and BLIS kernels
  - BLIS BF16 kernel support
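To make the post-ops idea concrete, the sketch below spells out a matmul followed by a bias add and a ReLU as three separate steps; a kernel with post-op support computes the same chain in one fused pass. Plain PyTorch is used here only to identify the pattern.

```python
# Sketch: a MatMul followed by two post-ops (bias add, ReLU), written as
# separate steps; a fused kernel produces the same result in a single pass.
import torch

x = torch.randn(32, 1024)
w = torch.randn(1024, 1024)
b = torch.randn(1024)

y = x @ w          # MatMul
y = y + b          # post-op 1: bias add
y = torch.relu(y)  # post-op 2: ReLU
```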