transformers 4.48 breaks Mistral FastLanguageModel.from_pretrained with NameError: name 'MistralConfig' is not defined #1528

Open
ArnoBen opened this issue Jan 11, 2025 · 2 comments
Labels
currently fixing Am fixing now!

Comments

ArnoBen commented Jan 11, 2025

Greetings,

I've been trying to run the Mistral_v0.3_(7B)-Conversational notebook on my own server.
I followed the notebook cells and was hit with NameError: name 'MistralConfig' is not defined, from this file:
unsloth/models/mistral.py, Line 310
even though from transformers import MistralConfig works fine.

Since it's running fine on Colab's server, I checked the package versions and noticed that Colab had transformers==4.47.1 whereas I was running 4.48.0.

Downgrading to 4.47.1 fixed the issue, but as things stand, simply pip-installing unsloth and calling the method with a Mistral model will break.
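For anyone who wants to detect the mismatch before loading a model, here is a minimal sketch of a version check. It is not part of unsloth. The only data points from this thread are that 4.47.1 works and 4.48.0 fails, so treating every newer release as suspect is an assumption, and the helper names are made up for illustration.

```python
import re
from importlib.metadata import PackageNotFoundError, version

# Reported in this thread as working with the Mistral notebook.
KNOWN_GOOD = (4, 47, 1)


def parse_version(text):
    """Return the leading numeric part of a version string as (major, minor, patch)."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        raise ValueError(f"unrecognised version string: {text!r}")
    # A missing patch component is treated as 0.
    return tuple(int(part or 0) for part in match.groups())


def newer_than_known_good(version_text):
    """True if the version is newer than the last release reported as working here."""
    return parse_version(version_text) > KNOWN_GOOD


if __name__ == "__main__":
    try:
        installed = version("transformers")
    except PackageNotFoundError:
        print("transformers is not installed")
    else:
        if newer_than_known_good(installed):
            # Assumption: anything newer than 4.47.1 may hit the NameError.
            print(f"transformers {installed} is newer than 4.47.1; consider pinning 4.47.1")
        else:
            print(f"transformers {installed} is at or below the version reported as working")
```

Pinning with `pip install "transformers==4.47.1"` is the downgrade described above.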

@InshaManowar

Traceback (most recent call last):
  File "/opt/conda/lib/python3.11/site-packages/unsloth/models/llama.py", line 1610, in from_pretrained
    model_patcher.pre_patch()
  File "/opt/conda/lib/python3.11/site-packages/unsloth/models/mistral.py", line 310, in pre_patch
    exec(function, globals())
  File "<string>", line 1, in <module>
NameError: name 'MistralConfig' is not defined

Got the same error, but downgrading transformers to 4.47.1 worked.
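The traceback shows the failure inside `exec(function, globals())`, which may explain why `from transformers import MistralConfig` can succeed in a notebook while the same name is missing during patching: `exec` resolves names in the namespace it is handed, not in the caller's session. The toy sketch below only illustrates that mechanism. The source string, the stand-in class, and the helper are hypothetical and are not unsloth's code, and the thread does not say what changed in transformers 4.48.

```python
# Hypothetical generated source that refers to a class by name at definition time.
# Default arguments are evaluated when the `def` runs, so the lookup happens
# during exec itself (line 1 of "<string>"), as in the traceback.
SOURCE = "def build(config_cls=MistralConfig):\n    return config_cls()\n"


class MistralConfig:
    """Toy stand-in for illustration; not the transformers class."""


def define_in(namespace):
    """Execute SOURCE in the given namespace and return the function it defines."""
    exec(SOURCE, namespace)
    return namespace["build"]


# Case 1: the namespace does not contain the name, so exec raises NameError,
# even though MistralConfig is defined in this file.
try:
    define_in({})
except NameError as error:
    missing_message = str(error)
else:
    missing_message = None

# Case 2: the name is present in the namespace handed to exec, so it works.
build = define_in({"MistralConfig": MistralConfig})
config = build()

print(missing_message)
print(type(config).__name__)
```

If the real cause is similar, the name would need to be present in the globals of the module that calls `exec`, which is independent of what the user has imported.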

@danielhanchen danielhanchen added the currently fixing Am fixing now! label Jan 14, 2025
@danielhanchen
Contributor

Apologies on the delay - working on a fix as we speak!
