whisper-turbo support #2439
I created the GGML with these commands and ran the benchmarks on my desktop with an Nvidia GPU:
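For reference, the standard whisper.cpp conversion route looks roughly like this; the paths are illustrative and assume side-by-side clones of openai/whisper and whisper.cpp, with a make build:

```bash
# Convert the PyTorch checkpoint (fetched by openai-whisper into ~/.cache/whisper)
# into GGML format using whisper.cpp's conversion script.
python3 models/convert-pt-to-ggml.py \
    ~/.cache/whisper/large-v3-turbo.pt ../whisper ./models
mv ./models/ggml-model.bin ./models/ggml-large-v3-turbo.bin

# Run the benchmark tool (-t sets the thread count).
./bench -m models/ggml-large-v3-turbo.bin -t 8
```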
@kth8 nice results! Did you put the ggml online somewhere? Would save me some time converting 🙂
Maybe share the GGML?
I haven't, but here are my benchmarks for the unquantized models as well:
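For the quantized half of a comparison like this, whisper.cpp's `quantize` tool can produce the smaller variants; a minimal sketch (q5_0 picked just as an example):

```bash
# Quantize the full-precision GGML to q5_0, then benchmark the result.
./quantize models/ggml-large-v3-turbo.bin \
    models/ggml-large-v3-turbo-q5_0.bin q5_0
./bench -m models/ggml-large-v3-turbo-q5_0.bin -t 8
```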
It has been uploaded to the HF repo: https://huggingface.co/ggerganov/whisper.cpp/blob/main/ggml-large-v3-turbo.bin
Great, thanks!!!
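For anyone who prefers the repo's helper script over a manual download, this should fetch the same file (model name taken from the uploaded filename):

```bash
# Downloads ggml-large-v3-turbo.bin into ./models
./models/download-ggml-model.sh large-v3-turbo
```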
But when I try using …
I've generated a CoreML version of Whisper-Turbo for Mac users. You can find more information about it here:
I've included a brief performance comparison for transcribing the samples/jfk.wav file:
Metal:
Metal & CoreML (first run):
Metal & CoreML (second run):
After updating the OS from Ventura to Sonoma, I ran the measurements again and found that the results were quite different. I've included the execution logs below for reference.
Metal:
Metal & CoreML (first run):
Metal & CoreML (second run):
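For anyone reproducing this, the usual whisper.cpp CoreML workflow is sketched below; it assumes the Python dependencies (coremltools, ane_transformers, openai-whisper) are installed and a make build:

```bash
# Generate the CoreML encoder (writes models/ggml-large-v3-turbo-encoder.mlmodelc).
./models/generate-coreml-model.sh large-v3-turbo

# Rebuild with CoreML support enabled.
make clean
WHISPER_COREML=1 make -j

# The first run compiles the CoreML model on-device, which is why the
# "first run" timings above are slower than the second run.
./main -m models/ggml-large-v3-turbo.bin -f samples/jfk.wav
```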
On a base M1 MBA running macOS Sequoia:
But if I disable the GPU, the results are faster?
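To reproduce both numbers, the CLI's `-ng` / `--no-gpu` flag switches off the Metal backend; a sketch:

```bash
# Default: Metal GPU enabled.
./main -m models/ggml-large-v3-turbo.bin -f samples/jfk.wav

# CPU-only: disable the GPU backend.
./main -ng -m models/ggml-large-v3-turbo.bin -f samples/jfk.wav
```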
Is a whisper-medium-turbo.bin model planned for release?
Did you find a solution to this, @kth8?
@spoeken no, I just went back to using the small model for streaming. |
OpenAI released Whisper-Turbo, a drop-in replacement for the large model: multilingual, 8x faster, with lower memory use and minimal degradation in accuracy.
https://github.com/openai/whisper