
fix: fix model sizes in supported models lists #167

Merged · 6 commits merged from fix-model-info into main · Apr 1, 2024
Conversation

@joein (Member) commented Mar 29, 2024

No description provided.

@joein (Member, Author) commented Mar 29, 2024

Hey @NirantK @Anush008

I went through the list of supported models and found some discrepancies.
I tried to fix them; could you please check whether these fixes are right?

There were several cases; here is how I chose the sizes:

  1. When only model.onnx is available, I simply took its size.
  2. When there are model.onnx and model.onnx_data, I chose the latter, since the former is usually less than 1 MB.
  3. When there are model.onnx, modelfp16.onnx, and model_quantized.onnx, I chose the first one; there is no mention of fp16 / quantized variants in the descriptions, so I assumed model.onnx is what gets loaded by default.

BAAI/bge-small-en-v1.5 had two sources, GCP and HF, but the models there vary in size: the HF one is two times smaller than the GCP one, so I decoupled them.

intfloat/multilingual-e5-large had a wrong link; the model was named differently in the cloud, so I renamed it there.
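As a side note, here is a minimal sketch of how these sizes can be cross-checked against the Hub. The use of `huggingface_hub` and the example repo id below are just illustrative assumptions, not something this PR relies on:

```python
from huggingface_hub import HfApi

def onnx_file_sizes(repo_id: str) -> dict:
    """Return the on-disk size in GB of every ONNX-related file in a Hub repo."""
    info = HfApi().model_info(repo_id, files_metadata=True)
    return {
        s.rfilename: round(s.size / 1e9, 3)
        for s in info.siblings
        if s.size is not None and s.rfilename.endswith((".onnx", ".onnx_data"))
    }

# e.g. see whether model.onnx, model.onnx_data, or a quantized file
# should back the size_in_GB entry for a given model
print(onnx_file_sizes("BAAI/bge-small-en-v1.5"))
```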


@joein requested a review from @Anush008 on March 30, 2024 at 11:20
@NirantK (Contributor) commented Apr 1, 2024

Wherever we have quantized models that meet the allclose tests, we'll use those as our default, without supporting the direct ONNX import, to avoid confusion.

We also prefer HF over URL download.

The split here would have changed the underlying default model without informing the user, so I've tried to make that consistent for all models here.

cc @joein @Anush008
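For context, the allclose check mentioned above amounts to comparing embeddings produced by the quantized and full-precision models; a minimal sketch, where the tolerance value is an assumption rather than the one used in the repo:

```python
import numpy as np

def quantized_matches_reference(reference: np.ndarray,
                                quantized: np.ndarray,
                                atol: float = 1e-3) -> bool:
    """True if the quantized model's embeddings stay within `atol` of the reference."""
    return np.allclose(reference, quantized, atol=atol)
```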

@NirantK merged commit 8b7b847 into main on Apr 1, 2024
17 checks passed
@NirantK deleted the fix-model-info branch on April 1, 2024 at 10:48
@@ -70,15 +61,6 @@
"dim": 384,
"description": "Fast and Default English model",
"size_in_GB": 0.13,
@joein (Member, Author) commented on the diff:
I believe this should be adjusted, since the quantized model is two times smaller.
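(For illustration, taking the listed 0.13 GB and the "two times smaller" observation at face value would put the quantized file at roughly 0.13 / 2 ≈ 0.065 GB; the exact value should be read from the file on the Hub.)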
