[Performance] Performance degradation with ZenDNN #5

aj-prime · 2024-06-12T16:06:12Z

Describe the issue

I followed the installation instructions in section 4 of the README.

Processor Name: AMD EPYC 7V13 64-Core Processor (Azure Cloud)

Performance (QPS):
ZenDNN: 34
CPU: 77
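
To put the two figures side by side, the gap can be worked out directly from the reported QPS values. The snippet below is only an illustrative calculation; the variable names are not part of any benchmark output.

```python
# QPS figures as reported in this issue
zendnn_qps = 34
cpu_qps = 77

# Fraction of the CPU execution provider's throughput that ZenDNN reaches
ratio = zendnn_qps / cpu_qps

# How many times faster the CPU execution provider is
speed_gap = cpu_qps / zendnn_qps

print(f"ZenDNN reaches {ratio:.0%} of the CPU EP throughput")
print(f"CPU EP is {speed_gap:.2f}x faster than ZenDNN")
```

In other words, with the default settings ZenDNN delivers a little under half of the CPU execution provider's throughput on this run.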

To reproduce

CPU: python -m onnxruntime.transformers.benchmark -m bert-large-uncased --model_class AutoModel -p fp32 -i 3 -t 10 -b 24 -s 16 -n 96 -v --provider cpu

ZenDNN: python -m onnxruntime.transformers.benchmark -m bert-large-uncased --model_class AutoModel -p fp32 -i 3 -t 10 -b 24 -s 16 -n 96 -v --provider zendnn

I also tried 64 threads, but that resulted in even worse performance.

Urgency

No response

Platform

Linux

OS Version

Ubuntu 20.04.4 LTS

ONNX Runtime Installation

Released Package

ONNX Runtime Version or Commit ID

onnxruntime-zendnn:1.17.0

ONNX Runtime API

Python

Architecture

X64

Execution Provider

Default CPU

Execution Provider Library Version

No response

Model File

No response

Is this a quantized model?

Unknown

ajeet1203singh · 2024-06-14T10:43:22Z

Hello @aj-prime, could you please share the environment variables you are using, and confirm that you followed the "Optimal Memory Allocator Settings Specific to ONNXRT" section of the user guide?

For this case our recommended settings are:
export GOMP_CPU_AFFINITY=0-63 &&
export OMP_NUM_THREADS=64 &&
export OMP_WAIT_POLICY=ACTIVE &&
export OMP_PROC_BIND=FALSE &&
export OMP_DYNAMIC=FALSE &&
export ZENDNN_MATMUL_ALGO=FP32:4 &&
export LD_PRELOAD=$ZENDNN_PARENT_FOLDER/openmp-10.0.1.src/runtime/src/libomp.so

with the transparent huge pages (THP) setting set to "always".
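
For anyone scripting the benchmark, the settings above could be applied from Python roughly as follows. This is a minimal sketch, not an official ZenDNN tool: the helper names (`zendnn_env`, `active_thp_mode`, `benchmark_command`, `run_zendnn_benchmark`) are hypothetical, the benchmark arguments are copied from the original report, and the THP path assumes the standard Linux sysfs location.

```python
import os
import subprocess
import sys

# Standard Linux sysfs file that reports the THP mode (assumed location).
THP_PATH = "/sys/kernel/mm/transparent_hugepage/enabled"


def zendnn_env(parent_folder, threads=64):
    """Return a copy of the current environment with the settings listed above."""
    env = dict(os.environ)
    env.update({
        "GOMP_CPU_AFFINITY": f"0-{threads - 1}",
        "OMP_NUM_THREADS": str(threads),
        "OMP_WAIT_POLICY": "ACTIVE",
        "OMP_PROC_BIND": "FALSE",
        "OMP_DYNAMIC": "FALSE",
        "ZENDNN_MATMUL_ALGO": "FP32:4",
        "LD_PRELOAD": os.path.join(
            parent_folder, "openmp-10.0.1.src/runtime/src/libomp.so"
        ),
    })
    return env


def active_thp_mode(text):
    """Parse a line such as 'always [madvise] never' and return the bracketed mode."""
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return None


def benchmark_command(provider):
    """Build the benchmark command from the report for the given provider."""
    return [
        sys.executable, "-m", "onnxruntime.transformers.benchmark",
        "-m", "bert-large-uncased", "--model_class", "AutoModel",
        "-p", "fp32", "-i", "3", "-t", "10", "-b", "24", "-s", "16",
        "-n", "96", "-v", "--provider", provider,
    ]


def run_zendnn_benchmark(parent_folder):
    """Warn if THP is not 'always', then run the ZenDNN benchmark."""
    with open(THP_PATH) as f:
        mode = active_thp_mode(f.read())
    if mode != "always":
        print(f"warning: THP mode is {mode!r}, expected 'always'")
    subprocess.run(
        benchmark_command("zendnn"), env=zendnn_env(parent_folder), check=True
    )
```

Calling `run_zendnn_benchmark(os.environ["ZENDNN_PARENT_FOLDER"])` would then launch the ZenDNN run with these variables set only for the child process, which leaves the parent shell's environment unchanged.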

aj-prime · 2024-06-17T21:42:36Z

Thanks @ajeet1203singh. Applying the Optimal Memory Allocator Settings resolved the issue.

lauthu · 2024-08-13T09:53:28Z

Hello @ajeet1203singh and @aj-prime, could you please share some numbers on the expected throughput improvement with ZenDNN 4.2?

I'm also trying to run the transformer benchmark and got a similar result (ZenDNN is slower than the CPU Execution Provider).

github-actions bot added the model:transformer label Jun 12, 2024

aj-prime closed this as completed Jun 17, 2024

lauthu mentioned this issue Aug 14, 2024

[Performance] benchmark result on ZenDNN EP #6

Open
