Model chooser screen in settings #18
Comments
File sizes (in bytes) will also be verified while downloading to prevent wasting bandwidth on a compromised account or server.
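The size check described above can be sketched as follows. This is a minimal illustration in Python (the app itself is Kotlin, and the function and parameter names here are hypothetical): the download is read in chunks and aborted as soon as the byte count exceeds the expected size, so a tampered oversized file never wastes the full bandwidth.

```python
import io

def download_with_size_check(stream, expected_size, chunk_size=8192):
    """Read a download stream, aborting as soon as it exceeds the
    expected size so bandwidth isn't wasted on a tampered file."""
    received = bytearray()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        received.extend(chunk)
        if len(received) > expected_size:
            # Abort early: the server sent more bytes than the
            # bundled metadata says this model should have.
            raise ValueError("download larger than expected; aborting")
    if len(received) != expected_size:
        raise ValueError("download smaller than expected")
    return bytes(received)

# Example with an in-memory stream standing in for the network:
data = download_with_size_check(io.BytesIO(b"x" * 100), expected_size=100)
```

Note that the early abort only limits waste to one extra chunk past the expected size; the hash check is still what ultimately guarantees integrity.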
This will have to be delayed to avoid complications when we switch to using Rust for running the Whisper models instead of whisper.cpp.
This would also solve #27 as far as I can see. What are the blockers for using a generic interface that would be hot-swapped when the Rust engine is implemented?
It could open up more work in the future, such as needing a model converter if the Rust engine uses a different format. Switching to Rust for running the models is a high priority, and I'm currently researching different Rust machine learning libraries to run Whisper with, to find one that's at least close to whisper.cpp's speed.
@soupslurpr thanks for the explanation. Although it would surprise me if a cpp-to-Rust adapter would ignore the established formats just for the sake of being different. That said, it's the FOSS world we're talking about here, and logic doesn't always prevail.
Whisper.cpp itself doesn't use GGUF; it uses its own custom .bin format. No idea why they haven't switched. Also, the Rust library I choose (possibly Burn) may use a different format, and I don't know how difficult it would be to convert.
I'm having trouble finding a library in Rust that can provide sufficient speed, so I think the model picker will have to be implemented before migrating to Rust. Refactors and rewrites need to be done first, and then the model picker can be implemented.

To keep things simple, it'll probably only have a few choices at first. One of them could be a multilingual model, with an option to either automatically detect the language or specify one. Testing is needed to make sure only languages which actually work with the model are exposed. The Base Q8_0 model could be the initial multilingual model; I'll have to check benchmarks and maybe ask for community feedback to determine which languages to allow choosing for it. Showing all languages, including ones that haven't been tested with the model at all, would be harmful because it would project a false impression that Transcribro supports them, and people will be upset when it outputs gibberish in their language.
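The tested-languages restriction above could be sketched like this. This is an illustrative Python snippet, not the app's actual code; the model id, the allowlist contents, and the helper name are all assumptions (the real set would come from benchmarks and community feedback):

```python
# Hypothetical per-model allowlist of languages verified to work well
# enough to expose in the UI. Values here are placeholders.
TESTED_LANGUAGES = {"base-q8_0": {"en", "es", "fr", "de"}}

def selectable_languages(model_id, all_whisper_languages):
    """Expose automatic detection plus only the languages that have
    been verified to work with the selected model."""
    tested = TESTED_LANGUAGES.get(model_id, set())
    return ["auto"] + sorted(
        lang for lang in all_whisper_languages if lang in tested
    )
```

Untested languages simply never appear in the picker, so the app can't project a false impression of supporting them.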
A screen in settings to download more models from Hugging Face within the app itself, pick the model that will be used, and manage or delete them. The models would be downloaded from a repo on my Hugging Face account, and the hashes of the files would be checked against hashes included with Transcribro to ensure integrity even in the event of a Hugging Face server compromise.
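The integrity check might look like the following sketch. Python stands in for the app's Kotlin here, and the filename and hash values are placeholders (the hash shown is simply the SHA-256 of an empty byte string, used purely so the example is self-checking); the point is that the expected digests ship inside the app, so a compromised Hugging Face repo can't serve a tampered model unnoticed:

```python
import hashlib

# Hashes bundled with the app (placeholder values for illustration).
# Shipping them in the APK means integrity doesn't depend on trusting
# the download server.
BUNDLED_SHA256 = {
    "ggml-base-q8_0.bin":
        "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
}

def verify_model(filename, data):
    """Return True only if the downloaded bytes match the bundled hash."""
    expected = BUNDLED_SHA256.get(filename)
    if expected is None:
        # Unknown file: refuse rather than trust it.
        return False
    return hashlib.sha256(data).hexdigest() == expected
```

A file that fails verification would be deleted and the download reported as failed rather than kept on disk.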
There should be a text box at the top of the screen to test the selected model using the Voice Input Keyboard.
The most recommended models would be shown first, and there shouldn't be an overwhelming amount of choice for no benefit. Test different model quants to choose enough models for a sensible variety of speed vs. accuracy vs. multilingualism, and clearly communicate those properties in the interface. If needed, there can be a "more models" button that goes to a screen with the other models, to keep the main list from being too long.
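One way to structure that catalog is sketched below. Again this is illustrative Python rather than the app's code, and the model names, property scales, and recommendation flags are invented for the example; it just shows recommended entries surfacing on the main screen with the rest behind a "more models" screen:

```python
from dataclasses import dataclass

@dataclass
class ModelEntry:
    name: str            # hypothetical model names
    speed: int           # 1 (slow) .. 5 (fast), shown in the UI
    accuracy: int        # 1 .. 5, shown in the UI
    multilingual: bool
    recommended: bool    # whether it appears on the main screen

CATALOG = [
    ModelEntry("Tiny Q8_0",  speed=5, accuracy=2, multilingual=False, recommended=True),
    ModelEntry("Base Q8_0",  speed=4, accuracy=3, multilingual=True,  recommended=True),
    ModelEntry("Small Q8_0", speed=2, accuracy=4, multilingual=True,  recommended=False),
]

def main_list(catalog):
    """Recommended models only; keeps the first screen short."""
    return [m for m in catalog if m.recommended]

def more_models(catalog):
    """Everything else, reached via the 'more models' button."""
    return [m for m in catalog if not m.recommended]
```

Keeping speed, accuracy, and multilingualism as explicit fields is what lets the UI communicate the trade-offs directly instead of expecting users to know what a quant name means.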
Additionally, there should be an option to import a model from a file. Imported models can show up below the ones downloaded from the app, but in a separate section so they aren't mistaken for the official ones.