Unable to export model by using --device npu --provider QNNExecutionProvider #1595

Open
huanji-sun-007 opened this issue Feb 5, 2025 · 1 comment

huanji-sun-007 commented Feb 5, 2025

Describe the bug
Hi,
I am trying to quantize and export a fine-tuned microsoft/Phi-3.5-mini-instruct model.
I was able to export it using --device cpu --provider CPUExecutionProvider.
However, when I try to export it using --device npu --provider QNNExecutionProvider, I get the following error:

ValueError: Message onnx.ModelProto exceeds maximum protobuf size of 2GB: 2729203422

To Reproduce
Steps to reproduce the behavior.

olive quantize \
   --model_name_or_path microsoft/Phi-3.5-mini-instruct \
   --trust_remote_code \
   --algorithm awq \
   --output_path outputs/models/awq \
   --log_level 1

olive auto-opt \
   --model_name_or_path outputs/models/awq/model \
   --output_path outputs/models/onnx-quant \
   --device npu \
   --provider QNNExecutionProvider \
   --dynamic-to-fixed-shape-dim-param batch \
   --dynamic-to-fixed-shape-dim-value 1 \
   --precision int4 \
   --batch_size 1 \
   --use_ort_genai \
   --log_level 1
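
For reference, the step that fails is Olive's dynamic-to-fixed-shape pass. From the traceback below, it ends up calling onnxruntime.tools.onnx_model_utils.fix_output_shapes; the dimension pinning itself presumably uses something like onnxruntime's make_dim_param_fixed utility. A simplified sketch of that flow (not Olive's actual code; the model path is hypothetical):

import onnx
from onnxruntime.tools.onnx_model_utils import make_dim_param_fixed, fix_output_shapes

# Load the converted model (hypothetical path) and pin the "batch" dim_param
# to 1, mirroring --dynamic-to-fixed-shape-dim-param/-value.
model = onnx.load("outputs/models/onnx-quant/intermediate/model.onnx")
make_dim_param_fixed(model.graph, "batch", 1)

# Re-infer the graph output shapes; this is the call that fails for models
# whose serialized proto exceeds 2 GB.
fix_output_shapes(model)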

Expected behavior
Be able to export the ONNX model using --device npu --provider QNNExecutionProvider.

Olive config

olive-ai[ort-genai,auto-opt]==0.7.1.1
autoawq==0.2.7.post2
auto-gptq==0.7.1
transformers==4.44.2
optimum==1.23.1
peft==0.13.2
accelerate==1.1.1
scipy==1.14.1
onnxruntime-genai==0.5.0
torchvision==0.18.1
tabulate==0.9.0
onnxruntime-genai-cuda==0.5.0
torch==2.3.1

Olive logs

Traceback (most recent call last):
  File "/opt/conda/envs/ptca/bin/olive", line 8, in <module>
    sys.exit(main())
  File "/opt/conda/envs/ptca/lib/python3.10/site-packages/olive/cli/launcher.py", line 62, in main
    service.run()
  File "/opt/conda/envs/ptca/lib/python3.10/site-packages/olive/cli/auto_opt.py", line 183, in run
    olive_run(run_config)
  File "/opt/conda/envs/ptca/lib/python3.10/site-packages/olive/workflows/run/run.py", line 317, in run
    return run_engine(package_config, run_config)
  File "/opt/conda/envs/ptca/lib/python3.10/site-packages/olive/workflows/run/run.py", line 259, in run_engine
    engine.run(
  File "/opt/conda/envs/ptca/lib/python3.10/site-packages/olive/engine/engine.py", line 252, in run
    run_result = self.run_accelerator(
  File "/opt/conda/envs/ptca/lib/python3.10/site-packages/olive/engine/engine.py", line 330, in run_accelerator
    output_footprint = self.run_no_search(input_model_config, input_model_id, accelerator_spec, output_dir)
  File "/opt/conda/envs/ptca/lib/python3.10/site-packages/olive/engine/engine.py", line 400, in run_no_search
    should_prune, signal, model_ids = self._run_passes(
  File "/opt/conda/envs/ptca/lib/python3.10/site-packages/olive/engine/engine.py", line 664, in _run_passes
    model_config, model_id = self._run_pass(
  File "/opt/conda/envs/ptca/lib/python3.10/site-packages/olive/engine/engine.py", line 764, in _run_pass
    output_model_config = host.run_pass(p, input_model_config, output_model_path, pass_search_point)
  File "/opt/conda/envs/ptca/lib/python3.10/site-packages/olive/systems/local.py", line 30, in run_pass
    output_model = the_pass.run(model, output_model_path, point)
  File "/opt/conda/envs/ptca/lib/python3.10/site-packages/olive/passes/olive_pass.py", line 245, in run
    output_model = self._run_for_config(model, config, output_model_path)
  File "/opt/conda/envs/ptca/lib/python3.10/site-packages/olive/passes/onnx/dynamic_to_fixed_shape.py", line 81, in _run_for_config
    fix_output_shapes(onnx_model)
  File "/opt/conda/envs/ptca/lib/python3.10/site-packages/onnxruntime/tools/onnx_model_utils.py", line 242, in fix_output_shapes
    m2 = onnx.shape_inference.infer_shapes(model)
  File "/opt/conda/envs/ptca/lib/python3.10/site-packages/onnx/shape_inference.py", line 45, in infer_shapes
    model_str = model if isinstance(model, bytes) else model.SerializeToString()
ValueError: Message onnx.ModelProto exceeds maximum protobuf size of 2GB: 2729203422
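
For context, protobuf caps a single message at 2 GB, and onnx.shape_inference.infer_shapes serializes the entire ModelProto in memory (model.SerializeToString() above) before inferring, so any model larger than that fails at this point regardless of the pass settings. A minimal sketch of the usual workaround for large models outside of Olive (not the fix itself) is to keep the weights as external data and use the path-based shape-inference API, which never builds a single >2 GB message; file names here are hypothetical:

import onnx

# The path-based variant reads and writes the model on disk (weights kept as
# external data), avoiding the in-memory SerializeToString() round trip.
onnx.shape_inference.infer_shapes_path(
    "model/model.onnx",           # hypothetical path to the exported model
    "model/model_inferred.onnx",  # shapes-inferred model written here
)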

Other information

  • OS: Linux
  • Olive version: 0.7.1.1
  • ONNXRuntime package and version: onnxruntime-genai-cuda==0.5.0
  • Transformers package version: 4.44.2

Additional context
I found a similar issue, #1165, which is also an ONNX conversion failure when the model size exceeds 2 GB.

jambayk commented Feb 6, 2025

Thanks for reporting the bug! I created #1600 to fix this.

Please note that the --provider QNNExecutionProvider option currently doesn't produce a model compatible with the QNN EP. That requires a more involved workflow that we are actively testing. I will let you know once we have a working example; it will take some more time after that to bring those changes into the auto-opt tool.

jambayk added a commit that referenced this issue Feb 10, 2025
…1600)

## Describe your changes
- `onnxruntime.tools.onnx_model_utils.fix_output_shapes` cannot handle
large models (#1595), so we use the ORT shape-inference helper and handle
the logic ourselves. This also means the pass can now handle models with
contrib operators (a rough sketch of the idea follows below this list).
- Allow passing 0 as dim_value. This case is possible when creating a
prompt-processing model from a dynamic-shaped LLM where we want to make
the past KV cache empty.
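
Not the actual change from this PR, just a rough sketch of the idea in the first bullet, assuming the helper in question is onnxruntime's SymbolicShapeInference: run shape inference on the in-memory model (it copies the proto rather than serializing it into a single message) and copy the inferred output shapes back onto the graph outputs. The function name is hypothetical:

import onnx
from onnxruntime.tools.symbolic_shape_infer import SymbolicShapeInference

def fix_output_shapes_for_large_model(model: onnx.ModelProto) -> None:
    # Symbolic shape inference works on the ModelProto directly, so it does
    # not hit the 2 GB protobuf serialization limit.
    inferred = SymbolicShapeInference.infer_shapes(model, auto_merge=True)

    # Copy the inferred output shapes back onto the original graph outputs.
    inferred_outputs = {o.name: o for o in inferred.graph.output}
    for output in model.graph.output:
        if output.name in inferred_outputs:
            output.type.CopyFrom(inferred_outputs[output.name].type)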

## Checklist before requesting a review
- [ ] Add unit tests for this change.
- [ ] Make sure all tests can pass.
- [ ] Update documents if necessary.
- [ ] Lint and apply fixes to your code by running `lintrunner -a`
- [ ] Is this a user-facing change? If yes, give a description of this
change to be included in the release notes.
- [ ] Is this PR including examples changes? If yes, please remember to
update [example
documentation](https://github.com/microsoft/Olive/blob/main/docs/source/examples.md)
in a follow-up PR.

## (Optional) Issue link