Fail to convert llama model to mlir #900

Open
bhbruce opened this issue Jan 9, 2025 · 3 comments


@bhbruce
Contributor

bhbruce commented Jan 9, 2025

Environment setup

  1. Modify models/requirements.txt to change the transformers version:
-transformers==4.37.1
+transformers==4.40.0
  2. Install packages:
pip install --no-compile --pre --upgrade -e models -r models/requirements.txt

Instructions to reproduce the error

python3 models/turbine_models/custom_models/stateless_llama.py --hf_model_name="TinyLlama/TinyLlama-1.1B-Chat-v1.0" --compile_to=torch --external_weights="safetensors" --quantization="unquantized" --precision="f16" --external_weight_file=w.safetensors

Log

lib/python3.11/site-packages/iree/turbine/aot/support/procedural/primitives.py", line 209, in _to_meta_tensor
    assert not any(
AssertionError: Unsupported dynamic dims in meta tensor
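
For reference, the assertion in the traceback amounts to a static-shape check: a torch "meta" tensor can only be materialized when every dimension is a concrete integer. A minimal sketch of that failure mode follows; it is a hypothetical reconstruction based only on the traceback above, and the function name, arguments, and dynamic-dim representation are illustrative, not the actual iree-turbine source.

import torch

def to_meta_tensor_sketch(shape, dtype):
    # Hypothetical stand-in for the check that fires in
    # iree/turbine/aot/support/procedural/primitives.py (_to_meta_tensor).
    # A torch "meta" tensor needs every dimension to be a concrete int,
    # so a dynamic/symbolic dim (modeled here as None) trips the assert.
    assert not any(
        d is None for d in shape
    ), "Unsupported dynamic dims in meta tensor"
    return torch.empty(shape, dtype=dtype, device="meta")

to_meta_tensor_sketch((1, 32, 64), torch.float16)      # static shape: fine
# to_meta_tensor_sketch((1, None, 64), torch.float16)  # dynamic dim: AssertionError

Bumping transformers from 4.37.1 to 4.40.0 changes how the model is traced, so a dimension (e.g. sequence length) that used to export as a concrete size may now come through as dynamic, which is likely what triggers this check.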

pip freeze

accelerate==1.2.1
aiohappyeyeballs==2.4.4
aiohttp==3.11.11
aiosignal==1.3.2
attrs==24.3.0
azure-core==1.32.0
azure-storage-blob==12.24.0
brevitas @ git+https://github.com/Xilinx/brevitas.git@6695e8df7f6a2c7715b9ed69c4b78157376bb60b
certifi==2024.12.14
cffi==1.17.1
charset-normalizer==3.4.1
cryptography==44.0.0
datasets==3.0.1
dependencies==2.0.1
diffusers @ git+https://github.com/nod-ai/diffusers@8fe5c93c70cd985dd589424d40a0116253300b4f
dill==0.3.8
einops==0.8.0
filelock==3.16.1
frozenlist==1.5.0
fsspec==2024.6.1
gguf==0.14.0
huggingface-hub==0.22.2
idna==3.10
importlib_metadata==8.5.0
iniconfig==2.0.0
iree-base-compiler==3.1.0
iree-base-runtime==3.1.0
iree-compiler==20241104.1068
iree-runtime==20241104.1068
iree-turbine @ git+https://github.com/iree-org/iree-turbine.git@e4550f37dcd8b9b691db93c30b478c1d67eee83b
isodate==0.7.2
Jinja2==3.1.5
MarkupSafe==3.0.2
ml_dtypes==0.5.1
mpmath==1.3.0
multidict==6.1.0
multiprocess==0.70.16
networkx==3.4.2
numpy==2.2.1
nvidia-cublas-cu12==12.4.5.8
nvidia-cuda-cupti-cu12==12.4.127
nvidia-cuda-nvrtc-cu12==12.4.127
nvidia-cuda-runtime-cu12==12.4.127
nvidia-cudnn-cu12==9.1.0.70
nvidia-cufft-cu12==11.2.1.3
nvidia-curand-cu12==10.3.5.147
nvidia-cusolver-cu12==11.6.1.9
nvidia-cusparse-cu12==12.3.1.170
nvidia-nccl-cu12==2.21.5
nvidia-nvjitlink-cu12==12.4.127
nvidia-nvtx-cu12==12.4.127
packaging==24.2
pandas==2.2.3
peft==0.13.2
pillow==11.1.0
pluggy==1.5.0
propcache==0.2.1
protobuf==5.29.3
psutil==6.1.1
pyarrow==18.1.0
pycparser==2.22
pytest==8.3.4
python-dateutil==2.9.0.post0
pytz==2024.2
PyYAML==6.0.2
regex==2024.11.6
requests==2.32.3
safetensors==0.5.2
scipy==1.15.0
sentencepiece==0.2.0
shark-turbine==2.4.1
-e git+https://github.com/nod-ai/sharktank.git@7849f8eb49c7519da48aa794322962211bc9b091#egg=sharktank
six==1.17.0
sympy==1.13.1
tokenizers==0.19.1
torch==2.5.1
torchsde==0.2.6
tqdm==4.67.1
trampoline==0.1.2
transformers==4.40.0
triton==3.1.0
-e git+ssh://[email protected]/nod-ai/SHARK-ModelDev.git@d551ab1d0656831f945af7bafccdf80912d50615#egg=turbine_models&subdirectory=models
typing_extensions==4.12.2
tzdata==2024.2
unfoldNd==0.2.3
urllib3==2.3.0
xxhash==3.5.0
yarl==1.18.3
zipp==3.21.0
@bhbruce
Contributor Author

bhbruce commented Jan 9, 2025

@ScottTodd Could you help with this issue?

@ScottTodd
Member

stateless_llama.py has been unmaintained for almost a year at this point. The path we are investing in now is documented at https://github.com/nod-ai/shark-ai/blob/main/docs/shortfin/llm/user/llama_serving.md. We'll be streamlining the workflows and documentation there soon, but the MLIR export step is here: https://github.com/nod-ai/shark-ai/blob/main/docs/shortfin/llm/user/llama_serving.md#export-to-mlir-using-sharktank, and I expect that should at least compile and run for CPU and ROCm/HIP. Might also work on Vulkan/CUDA/Metal/etc. but that is less tested.
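
For reference, the MLIR export step in that guide is driven by sharktank rather than stateless_llama.py. A rough sketch of the invocation is below; the module path, flag names, and file paths are assumptions taken from that workflow and may have changed, so treat the linked llama_serving.md doc as authoritative.

# Sketch only -- exact flags may differ; see the linked llama_serving.md guide.
python -m sharktank.examples.export_paged_llm_v1 \
  --gguf-file=/path/to/tinyllama.gguf \
  --output-mlir=model.mlir \
  --output-config=config.json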

@bhbruce
Contributor Author

bhbruce commented Jan 10, 2025

OK. I still see some updates in this repo for ONNX. Is this repo deprecated?
