BERT has not final model #1439

Open
dangokuson opened this issue Oct 25, 2024 · 3 comments

dangokuson commented Oct 25, 2024

Describe the bug
I tried to optimize the BERT model with bert_ptq_cpu.json, but the run produced 7 output models.
Is there any way to change the config so that I get only one output model?

[2024-10-25 10:54:59,192] [INFO] [engine.py:816:_run_passes] Run model evaluation for the final model...
[2024-10-25 10:54:59,195] [INFO] [footprint.py:101:create_pareto_frontier] Output all 7 models
[2024-10-25 10:54:59,196] [INFO] [footprint.py:120:_create_pareto_frontier_from_nodes] pareto frontier points: 3_OrtSessionParamsTuning-2-231aed55-cpu-cpu 
{
  "accuracy-accuracy": 0.8529411764705882,
  "accuracy-f1": 0.8913043478260869,
  "latency-avg": 48.46022,
  "latency-max": 65.62145,
  "latency-min": 40.43884,
  "throughput-avg": 20.34093,
  "throughput-max": 23.00423,
  "throughput-min": 16.20369
}
[2024-10-25 10:54:59,206] [INFO] [engine.py:367:run_accelerator] Save footprint to /Users/ubuntu/workspace/projects/AI_Research/Olive/examples/bert/models/bert_ptq_cpu/footprints.json.
[2024-10-25 10:54:59,214] [INFO] [engine.py:294:run] Run history for cpu-cpu:
[2024-10-25 10:54:59,234] [INFO] [engine.py:550:dump_run_history] run history:
+--------------------------------------------------+--------------------------------------------------+-----------------------------+----------------+--------------------------------------------+
| model_id                                         | parent_model_id                                  | from_pass                   |   duration_sec | metrics                                    |
+==================================================+==================================================+=============================+================+============================================+
| 9785a767                                         |                                                  |                             |                | {                                          |
|                                                  |                                                  |                             |                |   "accuracy-accuracy": 0.8602941176470589, |
|                                                  |                                                  |                             |                |   "accuracy-f1": 0.9042016806722689,       |
|                                                  |                                                  |                             |                |   "latency-avg": 77.18956,                 |
|                                                  |                                                  |                             |                |   "latency-max": 104.18961,                |
|                                                  |                                                  |                             |                |   "latency-min": 66.36365,                 |
|                                                  |                                                  |                             |                |   "throughput-avg": 13.79494,              |
|                                                  |                                                  |                             |                |   "throughput-max": 16.03933,              |
|                                                  |                                                  |                             |                |   "throughput-min": 11.9764                |
|                                                  |                                                  |                             |                | }                                          |
+--------------------------------------------------+--------------------------------------------------+-----------------------------+----------------+--------------------------------------------+
| 0_OnnxConversion-9785a767-0b0c1267               | 9785a767                                         | OnnxConversion              |        26.7029 |                                            |
+--------------------------------------------------+--------------------------------------------------+-----------------------------+----------------+--------------------------------------------+
| 1_OrtTransformersOptimization-0-67b9c681-cpu-cpu | 0_OnnxConversion-9785a767-0b0c1267               | OrtTransformersOptimization |        12.8088 |                                            |
+--------------------------------------------------+--------------------------------------------------+-----------------------------+----------------+--------------------------------------------+
| 2_OnnxQuantization-1-133a6d82                    | 1_OrtTransformersOptimization-0-67b9c681-cpu-cpu | OnnxQuantization            |        37.2535 |                                            |
+--------------------------------------------------+--------------------------------------------------+-----------------------------+----------------+--------------------------------------------+
| 3_OrtSessionParamsTuning-2-231aed55-cpu-cpu      | 2_OnnxQuantization-1-133a6d82                    | OrtSessionParamsTuning      |       103.319  | {                                          |
|                                                  |                                                  |                             |                |   "accuracy-accuracy": 0.8529411764705882, |
|                                                  |                                                  |                             |                |   "accuracy-f1": 0.8913043478260869,       |
|                                                  |                                                  |                             |                |   "latency-avg": 48.46022,                 |
|                                                  |                                                  |                             |                |   "latency-max": 65.62145,                 |
|                                                  |                                                  |                             |                |   "latency-min": 40.43884,                 |
|                                                  |                                                  |                             |                |   "throughput-avg": 20.34093,              |
|                                                  |                                                  |                             |                |   "throughput-max": 23.00423,              |
|                                                  |                                                  |                             |                |   "throughput-min": 16.20369               |
|                                                  |                                                  |                             |                | }                                          |
+--------------------------------------------------+--------------------------------------------------+-----------------------------+----------------+--------------------------------------------+
| 4_OnnxQuantization-1-80ae4847                    | 1_OrtTransformersOptimization-0-67b9c681-cpu-cpu | OnnxQuantization            |        36.2477 |                                            |
+--------------------------------------------------+--------------------------------------------------+-----------------------------+----------------+--------------------------------------------+
| 5_OrtSessionParamsTuning-4-231aed55-cpu-cpu      | 4_OnnxQuantization-1-80ae4847                    | OrtSessionParamsTuning      |        86.8893 | {                                          |
|                                                  |                                                  |                             |                |   "accuracy-accuracy": 0.8406862745098039, |
|                                                  |                                                  |                             |                |   "accuracy-f1": 0.8811700182815356,       |
|                                                  |                                                  |                             |                |   "latency-avg": 65.15545,                 |
|                                                  |                                                  |                             |                |   "latency-max": 74.85309,                 |
|                                                  |                                                  |                             |                |   "latency-min": 51.80688,                 |
|                                                  |                                                  |                             |                |   "throughput-avg": 15.95106,              |
|                                                  |                                                  |                             |                |   "throughput-max": 17.65159,              |
|                                                  |                                                  |                             |                |   "throughput-min": 14.35173               |
|                                                  |                                                  |                             |                | }                                          |
+--------------------------------------------------+--------------------------------------------------+-----------------------------+----------------+--------------------------------------------+

To Reproduce

olive run --config bert_ptq_cpu.json

Olive config
Olive configurations here: https://github.com/microsoft/Olive/blob/main/examples/bert/bert_ptq_cpu.json

Other information

  • OS: macOS
  • Olive version: main
  • ONNXRuntime package and version: 1.19.2
  • Transformers package version: 4.42.4

Additional context

onnx                               1.17.0
onnx-tool                          0.9.0
onnxconverter-common               1.14.0
onnxexplorer                       0.2.7
onnxruntime                        1.19.2
onnxruntime_extensions             0.12.0
onnxruntime-tools                  1.7.0
onnxsim                            0.4.36
skl2onnx                           1.17.0
tf2onnx                            1.16.1
@xiaoyu-work
Contributor

You can find the output model path in footprints.json. I have a PR open that copies the output model to output_dir: #1430. Once the PR is merged, you can pull the main branch and run the optimization again.
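
A minimal sketch of that lookup, in case it helps. It assumes the footprints file is a JSON object keyed by model id, and that each entry stores its path under `model_config` → `config` → `model_path`. That layout is an assumption and is not confirmed in this thread, so inspect your own file and adjust the keys. `model_paths` and `load_footprints` are hypothetical helper names, not part of Olive.

```python
import json
from pathlib import Path


def model_paths(footprints: dict) -> dict:
    """Map each model id to its recorded model path, or None if there is none.

    Assumed layout (not confirmed in this thread):
    {model_id: {"model_config": {"config": {"model_path": ...}}}}
    """
    paths = {}
    for model_id, node in footprints.items():
        # Tolerate missing or null sections instead of raising KeyError.
        config = ((node or {}).get("model_config") or {}).get("config") or {}
        paths[model_id] = config.get("model_path")
    return paths


def load_footprints(path: str) -> dict:
    """Read a footprints JSON file from disk."""
    return json.loads(Path(path).read_text())
```

With the run above, the call would look like `model_paths(load_footprints("models/bert_ptq_cpu/footprints.json"))`, run from the examples/bert directory.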

@dangokuson
Author

dangokuson commented Oct 28, 2024

@xiaoyu-work So, do I have to compare their accuracy and pick the best model myself, or should I use the last one?

@xiaoyu-work
Contributor

The first model in output_footprints.json is the best one. The output models are ranked by metrics in this file.
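
Going by that description, picking the best model comes down to reading the first entry. The sketch below assumes output_footprints.json is a JSON object keyed by model id, with the keys in ranked order. Python's `json` module keeps the key order of the file, so the first key is the top-ranked model. `best_model` is a hypothetical helper, and the flat metrics layout in the example is a simplification, not the real file format.

```python
import json


def best_model(output_footprints: dict):
    """Return (model_id, entry) for the first entry, i.e. the top-ranked model.

    Assumes a JSON object keyed by model id, already in ranked order.
    """
    if not output_footprints:
        raise ValueError("no output models recorded")
    model_id = next(iter(output_footprints))
    return model_id, output_footprints[model_id]


# Illustrative stand-in for the file contents: the ids and metric values are
# copied from the run history above, the dict layout is assumed.
example = json.loads("""
{
  "3_OrtSessionParamsTuning-2-231aed55-cpu-cpu": {
    "accuracy-accuracy": 0.8529411764705882, "latency-avg": 48.46022
  },
  "5_OrtSessionParamsTuning-4-231aed55-cpu-cpu": {
    "accuracy-accuracy": 0.8406862745098039, "latency-avg": 65.15545
  }
}
""")
best_id, best_entry = best_model(example)
print(best_id, best_entry["latency-avg"])
```

For a real run, replace `example` with `json.load(open(path))` on your own output_footprints.json.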
