Describe the bug
Unable to optimize a model with device cpu and precision int8; the run ends with a KeyError: 'input_model' error.
To Reproduce
Start with this example: https://github.com/microsoft/onnxruntime-inference-examples/tree/main/js/ort-whisper
The readme says:
Go to https://github.com/microsoft/Olive/tree/main/examples/whisper and follow the instructions.
Run the following commands
When I did the above with a pip install of olive-ai, I got the KeyError: 'config' error.
Then I tried installing from source as mentioned here - https://github.com/microsoft/Olive/blob/main/examples/README.md
git clone https://github.com/microsoft/Olive.git
cd Olive
python -m pip install .
Then I tried to "Run the config to optimize the model" from here - https://github.com/microsoft/Olive/blob/main/examples/whisper/README.md
This script runs and creates \Olive-main\examples\whisper\models\conversion-transformers_optimization-onnx_dynamic_quantization-insert_beam_search-prepost\whisper_cpu_int8_cpu-cpu_model.onnx
(olive_env) \Olive-main\examples\whisper>python test_transcription.py --config \Olive-main\examples\whisper\models\conversion-transformers_optimization-onnx_dynamic_quantization-insert_beam_search-prepost\whisper_cpu_int8_cpu-cpu_model.json
Traceback (most recent call last):
File "\Olive-main\examples\whisper\test_transcription.py", line 126, in <module>
output_text = main()
^^^^^^
File "\Olive-main\examples\whisper\test_transcription.py", line 63, in main
model_name = config["input_model"]["model_components"][0]["model_path"]
~~~~~~^^^^^^^^^^^^^^^
KeyError: 'input_model'
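For reference, line 63 of test_transcription.py reads the model name straight out of the config it is given. Below is a minimal sketch of that lookup with a defensive guard; the guard and its message are illustrative, not part of the actual script. It suggests the generated *_model.json, which has no input_model section, may have been passed where a workflow config such as whisper_cpu_int8.json is expected:

```python
import json

# Illustrative re-creation of the lookup at test_transcription.py line 63,
# with a guard for the missing key (the guard is not in the actual script).
with open("whisper_cpu_int8.json") as f:  # workflow config, not the *_model.json
    config = json.load(f)

if "input_model" not in config:
    raise SystemExit(
        "No 'input_model' section: this looks like an Olive output model JSON, "
        "not the workflow config the script expects."
    )
model_name = config["input_model"]["model_components"][0]["model_path"]
print(model_name)
```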
I renamed this model to whisper_cpu_int8_0_model.onnx, went back to the sample at https://github.com/microsoft/onnxruntime-inference-examples/tree/main/js/ort-whisper, and tried to run the model in the browser, getting the following error:
Error: Error: invalid input 'attention_mask'
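To narrow down the 'attention_mask' mismatch, the exported model's actual graph inputs can be listed with onnxruntime and compared against the feeds the ort-whisper sample constructs. A minimal sketch, assuming the renamed model path from above:

```python
import onnxruntime as ort

# Print the graph inputs of the exported model so they can be compared
# against the inputs the ort-whisper browser sample tries to feed it.
sess = ort.InferenceSession(
    "whisper_cpu_int8_0_model.onnx", providers=["CPUExecutionProvider"]
)
for inp in sess.get_inputs():
    print(inp.name, inp.shape, inp.type)
```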
Expected behavior
I should get a model that runs successfully with onnxruntime-web
Olive config
Add Olive configurations here.
Olive logs
(olive_env) \Olive-main\examples\whisper>python prepare_whisper_configs.py --model_name openai/whisper-tiny.en
config.json: 100%|████████████████████████████████████████████████████████████████████████| 1.94k/1.94k [00:00<?, ?B/s]
(olive_env) \Olive-main\examples\whisper>olive run --config whisper_cpu_int8.json --setup
[2024-08-06 15:01:08,786] [INFO] [run.py:90:get_required_packages] The following packages are required in the local environment: ['onnxruntime']
[2024-08-06 15:01:08,786] [INFO] [run.py:101:install_packages] installing packages: ['onnxruntime']
[2024-08-06 15:01:08,869] [INFO] [run.py:356:check_local_ort_installation] onnxruntime is already installed.
(olive_env) \Olive-main\examples\whisper>olive run --config whisper_cpu_int8.json 2> NUL
[2024-08-06 15:01:41,553] [INFO] [run.py:140:run_engine] Running workflow default_workflow
[2024-08-06 15:01:41,560] [INFO] [cache.py:51:init] Using cache directory: \Olive-main\examples\whisper\cache\default_workflow
[2024-08-06 15:01:41,570] [INFO] [engine.py:1020:save_olive_config] Saved Olive config to \Olive-main\examples\whisper\cache\default_workflow\olive_config.json
[2024-08-06 15:01:41,570] [DEBUG] [run.py:179:run_engine] Registering pass onnxconversion
[2024-08-06 15:01:41,570] [DEBUG] [run.py:179:run_engine] Registering pass orttransformersoptimization
[2024-08-06 15:01:41,570] [DEBUG] [run.py:179:run_engine] Registering pass onnxdynamicquantization
[2024-08-06 15:01:41,570] [DEBUG] [run.py:179:run_engine] Registering pass insertbeamsearch
[2024-08-06 15:01:41,570] [DEBUG] [run.py:179:run_engine] Registering pass appendprepostprocessingops
[2024-08-06 15:01:41,583] [DEBUG] [accelerator_creator.py:130:_fill_accelerators] The accelerator device and execution providers are specified, skipping deduce.
[2024-08-06 15:01:41,583] [DEBUG] [accelerator_creator.py:169:_check_execution_providers] Supported execution providers for device cpu: ['CPUExecutionProvider']
[2024-08-06 15:01:41,586] [DEBUG] [accelerator_creator.py:199:create_accelerators] Initial accelerators and execution providers: {'cpu': ['CPUExecutionProvider']}
[2024-08-06 15:01:41,586] [INFO] [accelerator_creator.py:224:create_accelerators] Running workflow on accelerator specs: cpu-cpu
[2024-08-06 15:01:41,586] [DEBUG] [run.py:235:run_engine] Pass onnxconversion already registered
[2024-08-06 15:01:41,586] [DEBUG] [run.py:235:run_engine] Pass orttransformersoptimization already registered
[2024-08-06 15:01:41,586] [DEBUG] [run.py:235:run_engine] Pass onnxdynamicquantization already registered
[2024-08-06 15:01:41,586] [DEBUG] [run.py:235:run_engine] Pass insertbeamsearch already registered
[2024-08-06 15:01:41,586] [DEBUG] [run.py:235:run_engine] Pass appendprepostprocessingops already registered
[2024-08-06 15:01:41,586] [DEBUG] [cache.py:304:set_cache_env] Set OLIVE_CACHE_DIR: \Olive-main\examples\whisper\cache\default_workflow
[2024-08-06 15:01:41,604] [INFO] [engine.py:277:run] Running Olive on accelerator: cpu-cpu
[2024-08-06 15:01:41,604] [INFO] [engine.py:1118:_create_system] Creating target system ...
[2024-08-06 15:01:41,604] [DEBUG] [engine.py:1114:create_system] create native OliveSystem SystemType.Local
[2024-08-06 15:01:41,614] [INFO] [engine.py:1121:_create_system] Target system created in 0.009509 seconds
[2024-08-06 15:01:41,614] [INFO] [engine.py:1130:_create_system] Creating host system ...
[2024-08-06 15:01:41,614] [DEBUG] [engine.py:1114:create_system] create native OliveSystem SystemType.Local
[2024-08-06 15:01:41,614] [INFO] [engine.py:1133:_create_system] Host system created in 0.000000 seconds
[2024-08-06 15:01:41,660] [DEBUG] [engine.py:717:_cache_model] Cached model 9139f706 to \Olive-main\examples\whisper\cache\default_workflow\models\9139f706.json
[2024-08-06 15:01:41,662] [DEBUG] [engine.py:352:run_accelerator] Running Olive in no-search mode ...
[2024-08-06 15:01:41,662] [DEBUG] [engine.py:444:run_no_search] Running ['conversion', 'transformers_optimization', 'onnx_dynamic_quantization', 'insert_beam_search', 'prepost'] with no search ...
[2024-08-06 15:01:41,662] [INFO] [engine.py:886:_run_pass] Running pass conversion:OnnxConversion
[2024-08-06 15:01:48,789] [DEBUG] [pytorch.py:194:get_dummy_inputs] Using dummy_inputs_func to get dummy inputs
[2024-08-06 15:01:51,423] [DEBUG] [conversion.py:196:_export_pytorch_model] Converting model on device cpu with dtype None.
[2024-08-06 15:01:56,203] [DEBUG] [pytorch.py:194:get_dummy_inputs] Using dummy_inputs_func to get dummy inputs
[2024-08-06 15:01:56,558] [DEBUG] [conversion.py:196:_export_pytorch_model] Converting model on device cpu with dtype None.
[2024-08-06 15:01:59,113] [INFO] [engine.py:988:_run_pass] Pass conversion:OnnxConversion finished in 17.451246 seconds
[2024-08-06 15:01:59,117] [DEBUG] [engine.py:717:_cache_model] Cached model 0_OnnxConversion-9139f706-5fa0d4af to \Olive-main\examples\whisper\cache\default_workflow\models\0_OnnxConversion-9139f706-5fa0d4af.json
[2024-08-06 15:01:59,120] [DEBUG] [engine.py:769:_cache_run] Cached run for 9139f706->0_OnnxConversion-9139f706-5fa0d4af into \Olive-main\examples\whisper\cache\default_workflow\runs\OnnxConversion-9139f706-5fa0d4af.json
[2024-08-06 15:01:59,122] [INFO] [engine.py:886:_run_pass] Running pass transformers_optimization:OrtTransformersOptimization
[2024-08-06 15:01:59,232] [DEBUG] [transformer_optimization.py:248:_run_for_config] model_type is set to bart from model attributes
[2024-08-06 15:01:59,233] [DEBUG] [transformer_optimization.py:254:_run_for_config] num_heads is set to 6 from model attributes
[2024-08-06 15:01:59,234] [DEBUG] [transformer_optimization.py:260:_run_for_config] hidden_size is set to 384 from model attributes
[2024-08-06 15:02:04,419] [DEBUG] [transformer_optimization.py:248:_run_for_config] model_type is set to bart from model attributes
[2024-08-06 15:02:04,419] [DEBUG] [transformer_optimization.py:254:_run_for_config] num_heads is set to 6 from model attributes
[2024-08-06 15:02:04,419] [DEBUG] [transformer_optimization.py:260:_run_for_config] hidden_size is set to 384 from model attributes
[2024-08-06 15:02:07,900] [INFO] [engine.py:988:_run_pass] Pass transformers_optimization:OrtTransformersOptimization finished in 8.773139 seconds
[2024-08-06 15:02:07,905] [DEBUG] [engine.py:717:_cache_model] Cached model 1_OrtTransformersOptimization-0-5c93fa9e-cpu-cpu to \Olive-main\examples\whisper\cache\default_workflow\models\1_OrtTransformersOptimization-0-5c93fa9e-cpu-cpu.json
[2024-08-06 15:02:07,905] [DEBUG] [engine.py:769:_cache_run] Cached run for 0_OnnxConversion-9139f706-5fa0d4af->1_OrtTransformersOptimization-0-5c93fa9e-cpu-cpu into \Olive-main\examples\whisper\cache\default_workflow\runs\OrtTransformersOptimization-0-5c93fa9e-cpu-cpu.json
[2024-08-06 15:02:07,905] [INFO] [engine.py:886:_run_pass] Running pass onnx_dynamic_quantization:OnnxDynamicQuantization
[2024-08-06 15:02:07,986] [INFO] [quantization.py:391:_run_for_config] Preprocessing model for quantization
[2024-08-06 15:02:11,336] [INFO] [quantization.py:391:_run_for_config] Preprocessing model for quantization
[2024-08-06 15:02:13,823] [INFO] [engine.py:988:_run_pass] Pass onnx_dynamic_quantization:OnnxDynamicQuantization finished in 5.917982 seconds
[2024-08-06 15:02:13,823] [DEBUG] [engine.py:717:_cache_model] Cached model 2_OnnxDynamicQuantization-1-a1261e22 to \Olive-main\examples\whisper\cache\default_workflow\models\2_OnnxDynamicQuantization-1-a1261e22.json
[2024-08-06 15:02:13,823] [DEBUG] [engine.py:769:_cache_run] Cached run for 1_OrtTransformersOptimization-0-5c93fa9e-cpu-cpu->2_OnnxDynamicQuantization-1-a1261e22 into \Olive-main\examples\whisper\cache\default_workflow\runs\OnnxDynamicQuantization-1-a1261e22.json
[2024-08-06 15:02:13,823] [INFO] [engine.py:886:_run_pass] Running pass insert_beam_search:InsertBeamSearch
Removed 67 initializers with duplicated value
Removed 33 initializers with duplicated value
[2024-08-06 15:02:16,653] [DEBUG] [insert_beam_search.py:302:chain_model] Using IR version 8 for chained model
[2024-08-06 15:02:17,329] [INFO] [engine.py:988:_run_pass] Pass insert_beam_search:InsertBeamSearch finished in 3.505282 seconds
[2024-08-06 15:02:17,329] [DEBUG] [engine.py:717:_cache_model] Cached model 3_InsertBeamSearch-2-82bf64f8 to \Olive-main\examples\whisper\cache\default_workflow\models\3_InsertBeamSearch-2-82bf64f8.json
[2024-08-06 15:02:17,329] [DEBUG] [engine.py:769:_cache_run] Cached run for 2_OnnxDynamicQuantization-1-a1261e22->3_InsertBeamSearch-2-82bf64f8 into \Olive-main\examples\whisper\cache\default_workflow\runs\InsertBeamSearch-2-82bf64f8.json
[2024-08-06 15:02:17,336] [INFO] [engine.py:886:_run_pass] Running pass prepost:AppendPrePostProcessingOps
[2024-08-06 15:02:18,924] [INFO] [engine.py:988:_run_pass] Pass prepost:AppendPrePostProcessingOps finished in 1.587309 seconds
[2024-08-06 15:02:18,936] [DEBUG] [engine.py:717:_cache_model] Cached model 4_AppendPrePostProcessingOps-3-9e247843 to \Olive-main\examples\whisper\cache\default_workflow\models\4_AppendPrePostProcessingOps-3-9e247843.json
[2024-08-06 15:02:18,939] [DEBUG] [engine.py:769:_cache_run] Cached run for 3_InsertBeamSearch-2-82bf64f8->4_AppendPrePostProcessingOps-3-9e247843 into \Olive-main\examples\whisper\cache\default_workflow\runs\AppendPrePostProcessingOps-3-9e247843.json
[2024-08-06 15:02:18,939] [INFO] [engine.py:862:_run_passes] Run model evaluation for the final model...
[2024-08-06 15:02:18,939] [DEBUG] [engine.py:1059:_evaluate_model] Evaluating model ...
[2024-08-06 15:02:20,189] [DEBUG] [ort_inference.py:72:get_ort_inference_session] inference_settings: {'execution_provider': ['CPUExecutionProvider'], 'provider_options': None}
[2024-08-06 15:02:20,189] [DEBUG] [ort_inference.py:111:get_ort_inference_session] Normalized providers: ['CPUExecutionProvider'], provider_options: [{}]
[2024-08-06 15:03:18,633] [DEBUG] [footprint.py:234:_resolve_metrics] There is no goal set for metric: latency-avg.
[2024-08-06 15:03:18,636] [DEBUG] [engine.py:864:_run_passes] Signal: {
"latency-avg": 1824.62912
}
[2024-08-06 15:03:19,964] [INFO] [engine.py:378:run_accelerator] Save footprint to models\whisper_cpu_int8_cpu-cpu_footprints.json.
[2024-08-06 15:03:19,970] [DEBUG] [engine.py:380:run_accelerator] run_accelerator done
[2024-08-06 15:03:19,970] [INFO] [engine.py:294:run] Run history for cpu-cpu:
[2024-08-06 15:03:21,520] [INFO] [engine.py:591:dump_run_history] run history:
+--------------------------------------------------+--------------------------------------------------+-----------------------------+----------------+-----------------------------+
| model_id | parent_model_id | from_pass | duration_sec | metrics |
+==================================================+==================================================+=============================+================+=============================+
| 9139f706 | | | | |
+--------------------------------------------------+--------------------------------------------------+-----------------------------+----------------+-----------------------------+
| 0_OnnxConversion-9139f706-5fa0d4af | 9139f706 | OnnxConversion | 17.4512 | |
+--------------------------------------------------+--------------------------------------------------+-----------------------------+----------------+-----------------------------+
| 1_OrtTransformersOptimization-0-5c93fa9e-cpu-cpu | 0_OnnxConversion-9139f706-5fa0d4af | OrtTransformersOptimization | 8.77314 | |
+--------------------------------------------------+--------------------------------------------------+-----------------------------+----------------+-----------------------------+
| 2_OnnxDynamicQuantization-1-a1261e22 | 1_OrtTransformersOptimization-0-5c93fa9e-cpu-cpu | OnnxDynamicQuantization | 5.91798 | |
+--------------------------------------------------+--------------------------------------------------+-----------------------------+----------------+-----------------------------+
| 3_InsertBeamSearch-2-82bf64f8 | 2_OnnxDynamicQuantization-1-a1261e22 | InsertBeamSearch | 3.50528 | |
+--------------------------------------------------+--------------------------------------------------+-----------------------------+----------------+-----------------------------+
| 4_AppendPrePostProcessingOps-3-9e247843 | 3_InsertBeamSearch-2-82bf64f8 | AppendPrePostProcessingOps | 1.58731 | { |
| | | | | "latency-avg": 1824.62912 |
| | | | | } |
+--------------------------------------------------+--------------------------------------------------+-----------------------------+----------------+-----------------------------+
[2024-08-06 15:03:21,770] [INFO] [engine.py:309:run] No packaging config provided, skip packaging artifacts
Other information
Additional context
Trying to run this sample - https://github.com/microsoft/onnxruntime-inference-examples/tree/main/js/ort-whisper