Consider this model from Xenova: it includes a quantized model that is 120M, instead of the 440-450M I get from O3 optimization with Optimum.
Compare whether the quantized model is as good as the 450M one (O3 with an atol of 1e-3, O2 with an atol of 1e-4), or is something else happening there?
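One way to make that comparison concrete is to run the same input through both models and check the largest element-wise difference against each tolerance. The sketch below is plain Python over flattened output values. The function names and the sample numbers are made up for illustration; real values would come from running each model on the same input.

```python
def max_abs_diff(reference, candidate):
    """Largest element-wise absolute difference between two flat sequences."""
    if len(reference) != len(candidate):
        raise ValueError("outputs must have the same length")
    return max((abs(r - c) for r, c in zip(reference, candidate)), default=0.0)


def within_atol(reference, candidate, atol):
    """True when every candidate value is within atol of the reference value."""
    return max_abs_diff(reference, candidate) <= atol


# Illustrative values only, not measured from any model.
reference_output = [0.1000, -0.2500, 0.7300]
candidate_output = [0.1004, -0.2497, 0.7309]

for atol in (1e-3, 1e-4):
    print(atol, within_atol(reference_output, candidate_output, atol))
```

A model that passes at 1e-3 but not at 1e-4 is not necessarily worse on the downstream task, so a task-level metric is probably a better final check than the tolerance alone.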
See Static & Dynamic quantization here: https://huggingface.co/docs/optimum/onnxruntime/usage_guides/quantization
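For reference, a minimal sketch of dynamic quantization using ONNX Runtime's `quantize_dynamic` directly. Storing weights as int8 takes one byte per value instead of four for float32, which would be consistent with the roughly 4x size gap described above, although that is a guess and not something confirmed in this thread. The helper names and the `_quantized` file suffix are assumptions made for this example.

```python
from pathlib import Path


def quantized_path(model_path):
    """Hypothetical naming helper: model.onnx -> model_quantized.onnx."""
    path = Path(model_path)
    return path.with_name(f"{path.stem}_quantized{path.suffix}")


def quantize_weights_int8(model_path):
    """Dynamically quantize an ONNX model's weights to int8; return the new path."""
    # Imported lazily so the naming helper works without onnxruntime installed.
    from onnxruntime.quantization import QuantType, quantize_dynamic

    output_path = quantized_path(model_path)
    quantize_dynamic(
        model_input=str(model_path),
        model_output=str(output_path),
        weight_type=QuantType.QInt8,
    )
    return output_path
```

Calling `quantize_weights_int8("model.onnx")` would write `model_quantized.onnx` next to the input, and comparing the two file sizes shows how much of the reduction comes from the weights.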
From @xenova: this script traverses the graph and collects the operators to quantize: https://github.com/xenova/transformers.js/blob/main/scripts/convert.py
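The traversal idea can be sketched roughly as follows. This is not the code from the linked script, only an illustration of walking an ONNX graph, including subgraphs nested in node attributes (as used by control-flow operators such as `If` and `Loop`), and collecting the distinct operator types. The function names are hypothetical.

```python
def collect_op_types(graph, found=None):
    """Recursively collect the distinct op_type values in an ONNX-style graph."""
    found = set() if found is None else found
    for node in graph.node:
        found.add(node.op_type)
        for attribute in node.attribute:
            # An unset `g` field reads as an empty graph in protobuf,
            # so recursing into it is harmless.
            collect_op_types(attribute.g, found)
            for subgraph in attribute.graphs:
                collect_op_types(subgraph, found)
    return found


def op_types_in_model(model_path):
    """Load an ONNX file and return its operator types in sorted order."""
    # Imported lazily so collect_op_types can be used without onnx installed.
    import onnx

    model = onnx.load(model_path)
    return sorted(collect_op_types(model.graph))
```

A list like this could then be passed as `op_types_to_quantize` to ONNX Runtime's `quantize_dynamic`; check the linked script for what it actually does with the collected operators.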
NirantK