Describe the bug
Failed to run Olive on gpu-cuda.
To Reproduce
Download https://huggingface.co/mistralai/Mistral-7B-v0.1/tree/main to the folder: D:\windowsAI\HFModel\Mistral-7B-v01
Follow the readme: https://github.com/microsoft/Olive/tree/main/examples/mistral
Run: python mistral.py --optimize --config mistral_fp16_optimize.json --model_id D:\windowsAI\HFModel\Mistral-7B-v01
If this method is not right, could you help list the correct steps?
My virtual environment's pip list:
Package Version Editable project location
accelerate 0.33.0
aiohappyeyeballs 2.4.0
aiohttp 3.10.5
aiosignal 1.3.1
alembic 1.13.2
annotated-types 0.7.0
attrs 24.2.0
certifi 2024.7.4
charset-normalizer 3.3.2
colorama 0.4.6
coloredlogs 15.0.1
colorlog 6.8.2
contourpy 1.2.1
cycler 0.12.1
datasets 2.21.0
Deprecated 1.2.14
dill 0.3.8
evaluate 0.4.2
filelock 3.15.4
flatbuffers 24.3.25
fonttools 4.53.1
frozenlist 1.4.1
fsspec 2024.6.1
greenlet 3.0.3
huggingface-hub 0.24.6
humanfriendly 10.0
idna 3.8
inquirerpy 0.3.4
Jinja2 3.1.4
joblib 1.4.2
kiwisolver 1.4.5
lightning-utilities 0.11.6
Mako 1.3.5
MarkupSafe 2.1.5
matplotlib 3.9.2
mpmath 1.3.0
multidict 6.0.5
multiprocess 0.70.16
networkx 3.3
neural_compressor 3.0
numpy 1.26.4
olive-ai 0.7.0 D:\windowsAI\Olive
onnx 1.16.2
onnxconverter-common 1.14.0
onnxruntime-directml 1.19.0
onnxruntime_extensions 0.12.0
onnxruntime-gpu 1.19.0
opencv-python-headless 4.10.0.84
optimum 1.21.4
optuna 3.6.1
packaging 24.1
pandas 2.2.2
pfzy 0.3.4
pillow 10.4.0
pip 24.2
prettytable 3.11.0
prompt_toolkit 3.0.47
protobuf 3.20.2
psutil 6.0.0
py-cpuinfo 9.0.0
pyarrow 17.0.0
pycocotools 2.0.8
pydantic 2.8.2
pydantic_core 2.20.1
pyparsing 3.1.4
pyreadline3 3.4.1
python-dateutil 2.9.0.post0
pytz 2024.1
PyYAML 6.0.2
regex 2024.7.24
requests 2.32.3
safetensors 0.4.4
schema 0.7.7
scikit-learn 1.5.1
scipy 1.14.1
sentencepiece 0.2.0
setuptools 73.0.1
six 1.16.0
skl2onnx 1.17.0
SQLAlchemy 2.0.32
sympy 1.13.2
tabulate 0.9.0
tf2onnx 1.16.1
threadpoolctl 3.5.0
tokenizers 0.19.1
torch 2.4.0
torchaudio 2.4.0
torchmetrics 1.4.1
torchvision 0.19.0
tqdm 4.66.5
transformers 4.43.4
typing_extensions 4.12.2
tzdata 2024.1
urllib3 2.2.2
wcwidth 0.2.13
wrapt 1.16.0
xxhash 3.5.0
yarl 1.9.4
Expected behavior
Generate an optimized model.
Olive config
--config mistral_fp16_optimize.json
Olive logs
```
(mistral_env) D:\windowsAI\Olive\examples\mistral>python mistral.py --optimize --config mistral_fp16_optimize.json --model_id D:\windowsAI\HFModel\Mistral-7B-v01
optimized_model_dir is:D:\windowsAI\Olive\examples\mistral\models\convert-optimize-perf_tuning\mistral_fp16_gpu-cuda_model
Optimizing D:\windowsAI\HFModel\Mistral-7B-v01
[2024-08-31 17:50:42,659] [INFO] [run.py:138:run_engine] Running workflow default_workflow
[2024-08-31 17:50:42,704] [INFO] [cache.py:51:init] Using cache directory: D:\windowsAI\Olive\examples\mistral\cache\default_workflow
[2024-08-31 17:50:42,757] [INFO] [engine.py:1013:save_olive_config] Saved Olive config to D:\windowsAI\Olive\examples\mistral\cache\default_workflow\olive_config.json
[2024-08-31 17:50:42,846] [INFO] [accelerator_creator.py:224:create_accelerators] Running workflow on accelerator specs: gpu-cuda
[2024-08-31 17:50:42,888] [INFO] [engine.py:275:run] Running Olive on accelerator: gpu-cuda
[2024-08-31 17:50:42,888] [INFO] [engine.py:1110:_create_system] Creating target system ...
[2024-08-31 17:50:42,889] [INFO] [engine.py:1113:_create_system] Target system created in 0.000000 seconds
[2024-08-31 17:50:42,889] [INFO] [engine.py:1122:_create_system] Creating host system ...
[2024-08-31 17:50:42,891] [INFO] [engine.py:1125:_create_system] Host system created in 0.000000 seconds
passes is [('convert', {}), ('optimize', {}), ('perf_tuning', {})]
[2024-08-31 17:50:43,102] [INFO] [engine.py:877:_run_pass] Running pass convert:OptimumConversion
Framework not specified. Using pt to export the model.
[2024-08-31 17:50:54,785] [ERROR] [engine.py:976:_run_pass] Pass run failed.
Traceback (most recent call last):
File "D:\windowsAI\Olive\olive\engine\engine.py", line 964, in _run_pass
output_model_config = host.run_pass(p, input_model_config, output_model_path, pass_search_point)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\windowsAI\Olive\olive\systems\local.py", line 30, in run_pass
output_model = the_pass.run(model, output_model_path, point)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\windowsAI\Olive\olive\passes\olive_pass.py", line 206, in run
output_model = self._run_for_config(model, config, output_model_path)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\windowsAI\Olive\olive\passes\onnx\optimum_conversion.py", line 96, in run_for_config
export_optimum_model(model.model_name_or_path, output_model_path, **extra_args)
File "D:\windowsAI\mistral_env\Lib\site-packages\optimum\exporters\onnx_main.py", line 248, in main_export
task = TasksManager.infer_task_from_model(model_name_or_path)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\windowsAI\mistral_env\Lib\site-packages\optimum\exporters\tasks.py", line 1680, in infer_task_from_model
task = cls._infer_task_from_model_name_or_path(model, subfolder=subfolder, revision=revision)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\windowsAI\mistral_env\Lib\site-packages\optimum\exporters\tasks.py", line 1593, in _infer_task_from_model_name_or_path
raise RuntimeError(
RuntimeError: Cannot infer the task from a local directory yet, please specify the task manually (masked-im, automatic-speech-recognition, fill-mask, object-detection, text2text-generation, text-to-audio, image-to-image, audio-xvector, image-segmentation, mask-generation, zero-shot-object-detection, image-to-text, semantic-segmentation, question-answering, feature-extraction, conversational, token-classification, text-classification, audio-classification, depth-estimation, sentence-similarity, zero-shot-image-classification, audio-frame-classification, multiple-choice, text-generation, image-classification, stable-diffusion, stable-diffusion-xl).
[2024-08-31 17:50:55,193] [WARNING] [engine.py:370:run_accelerator] Failed to run Olive on gpu-cuda.
Traceback (most recent call last):
File "D:\windowsAI\Olive\olive\engine\engine.py", line 349, in run_accelerator
output_footprint = self.run_no_search(
^^^^^^^^^^^^^^^^^^^
File "D:\windowsAI\Olive\olive\engine\engine.py", line 441, in run_no_search
should_prune, signal, model_ids = self._run_passes(
^^^^^^^^^^^^^^^^^
File "D:\windowsAI\Olive\olive\engine\engine.py", line 814, in _run_passes
model_config, model_id, output_model_hash = self._run_pass(
^^^^^^^^^^^^^^^
File "D:\windowsAI\Olive\olive\engine\engine.py", line 964, in _run_pass
output_model_config = host.run_pass(p, input_model_config, output_model_path, pass_search_point)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\windowsAI\Olive\olive\systems\local.py", line 30, in run_pass
output_model = the_pass.run(model, output_model_path, point)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\windowsAI\Olive\olive\passes\olive_pass.py", line 206, in run
output_model = self._run_for_config(model, config, output_model_path)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\windowsAI\Olive\olive\passes\onnx\optimum_conversion.py", line 96, in run_for_config
export_optimum_model(model.model_name_or_path, output_model_path, **extra_args)
File "D:\windowsAI\mistral_env\Lib\site-packages\optimum\exporters\onnx_main.py", line 248, in main_export
task = TasksManager.infer_task_from_model(model_name_or_path)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\windowsAI\mistral_env\Lib\site-packages\optimum\exporters\tasks.py", line 1680, in infer_task_from_model
task = cls._infer_task_from_model_name_or_path(model, subfolder=subfolder, revision=revision)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\windowsAI\mistral_env\Lib\site-packages\optimum\exporters\tasks.py", line 1593, in _infer_task_from_model_name_or_path
raise RuntimeError(
RuntimeError: Cannot infer the task from a local directory yet, please specify the task manually (masked-im, automatic-speech-recognition, fill-mask, object-detection, text2text-generation, text-to-audio, image-to-image, audio-xvector, image-segmentation, mask-generation, zero-shot-object-detection, image-to-text, semantic-segmentation, question-answering, feature-extraction, conversational, token-classification, text-classification, audio-classification, depth-estimation, sentence-similarity, zero-shot-image-classification, audio-frame-classification, multiple-choice, text-generation, image-classification, stable-diffusion, stable-diffusion-xl).
[2024-08-31 17:50:55,199] [INFO] [engine.py:292:run] Run history for gpu-cuda:
[2024-08-31 17:50:55,347] [INFO] [engine.py:587:dump_run_history] run history:
+------------+-------------------+-------------+----------------+-----------+
| model_id | parent_model_id | from_pass | duration_sec | metrics |
+============+===================+=============+================+===========+
| d03e43d3 | | | | |
+------------+-------------------+-------------+----------------+-----------+
[2024-08-31 17:50:55,378] [INFO] [engine.py:307:run] No packaging config provided, skip packaging artifacts
```
Other information
Additional context
None
Looks like the optimum export is failing on the local model. Could you try replacing the "convert" pass config with this?
{ "type": "OnnxConversion", "target_opset": 17, "torch_dtype": "float32" }
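For context, here is a sketch of where that entry would sit in mistral_fp16_optimize.json. This assumes the config follows Olive's usual top-level "passes" layout, which the log's pass list ([('convert', {}), ('optimize', {}), ('perf_tuning', {})]) suggests; only the convert entry is shown, and the optimize and perf_tuning entries, along with the rest of the config, stay as they are in the original file:

```json
{
    "passes": {
        "convert": {
            "type": "OnnxConversion",
            "target_opset": 17,
            "torch_dtype": "float32"
        }
    }
}
```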
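Separately, the RuntimeError names the root cause directly: optimum cannot infer the task from a local directory, so specifying the task manually is another way around the error. Below is a minimal sketch (not from the original thread) of calling optimum's exporter with the task given explicitly; the output directory is a hypothetical placeholder, and "text-generation" is the task optimum would infer for the mistralai/Mistral-7B-v0.1 hub id:

```python
# Minimal sketch, assuming optimum 1.21.x: export the local Mistral checkpoint
# to ONNX with the task stated explicitly, so TasksManager never has to infer
# it from a local directory. The output directory is a made-up placeholder.
from optimum.exporters.onnx import main_export

main_export(
    model_name_or_path=r"D:\windowsAI\HFModel\Mistral-7B-v01",
    output=r"D:\windowsAI\HFModel\Mistral-7B-v01-onnx",
    task="text-generation",
)
```

Since the traceback shows Olive's OptimumConversion pass forwarding **extra_args into this same main_export call, adding "extra_args": { "task": "text-generation" } to the original convert pass config may achieve the same thing while keeping the OptimumConversion pass.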