batching multi-client server #42

Open
Gldkslfmsd opened this issue Dec 8, 2023 · 3 comments

Comments

@Gldkslfmsd (Collaborator)

> > How to use this to allow multiple clients to connect when you host a server or create an API for live transcription?

I don't know; it's a topic that requires a separate issue. But first, there must be a Whisper backend that supports batching -- processing more inputs at once. If there isn't one, then use one GPU with one server per client.

Thank you. Using one GPU for each client is a tall order for me, as there could be up to a dozen clients active at any given time in my use case. I think there are a few backends that do support batched processing, e.g. https://github.com/Blair-Johnson/batch-whisper
It would help if you have any references, or if you can point me to the parts where changes are needed to implement this.
Or is it alright if I create a new issue for this?

Originally posted by @umaryasin33 in #10 (comment)

@Gldkslfmsd (Collaborator, Author)

I also found this fast, batching Whisper backend: https://github.com/Vaibhavs10/insanely-fast-whisper

@Gldkslfmsd (Collaborator, Author)

Gldkslfmsd commented Dec 8, 2023

> you can point me to the parts where changes are needed to implement this.

First, you need a multi-client server. It handles each client the same way as the single-client one, but it needs a new subclass of ASRBase that connects through an API to a batching backend. Maybe the API could be shared with #34 ?
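A minimal sketch of what such a subclass could look like, assuming the batching backend is exposed over a plain HTTP JSON endpoint. The class name, the `build_payload` helper, the endpoint URL, and the payload format are all hypothetical illustrations, not the actual ASRBase interface:

```python
import json
import urllib.request


class RemoteBatchingASR:
    """One instance per connected client; all instances share a single
    batching backend over HTTP (hypothetical sketch, not the real ASRBase)."""

    def __init__(self, backend_url="http://localhost:9000/transcribe", lan="en"):
        self.backend_url = backend_url
        self.lan = lan

    def build_payload(self, audio_chunk, init_prompt=""):
        # Serialize one client's chunk; the backend queues it and runs it
        # in the next batch together with other clients' chunks.
        return json.dumps({
            "lang": self.lan,
            "prompt": init_prompt,
            "audio": list(audio_chunk),  # raw float samples, for illustration
        }).encode()

    def transcribe(self, audio_chunk, init_prompt=""):
        # POST the chunk and block until the batched result comes back.
        req = urllib.request.Request(
            self.backend_url,
            data=self.build_payload(audio_chunk, init_prompt),
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req) as resp:
            return json.loads(resp.read())  # e.g. {"segments": [...]}
```

The server would then instantiate one such object per client while the GPU work happens once per batch on the backend side.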

And then you need the Whisper batching backend and its API -- I don't know which approach is optimal: a subprocess, a network API, etc.
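Whatever transport is chosen, the backend side usually follows the same pattern: collect requests from many clients for a short window, then run them through the model as one batch. A sketch under that assumption; `model.transcribe_batch` is a placeholder for whichever batching backend is plugged in (e.g. batch-whisper), not a real API:

```python
import queue
import time


def collect_batch(requests, batch_window=0.1, max_batch=16):
    """Block for the first request, then gather more until the time
    window closes or the batch is full."""
    batch = [requests.get()]
    deadline = time.monotonic() + batch_window
    while len(batch) < max_batch:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        try:
            batch.append(requests.get(timeout=remaining))
        except queue.Empty:
            break
    return batch


def serve(requests, model):
    # One GPU call transcribes every queued client at once, then each
    # client's result is routed back through its own reply queue.
    while True:
        batch = collect_batch(requests)
        audios = [audio for audio, reply_q in batch]
        results = model.transcribe_batch(audios)  # placeholder batched call
        for (audio, reply_q), text in zip(batch, results):
            reply_q.put(text)
```

The window/size trade-off is the usual one: a longer `batch_window` improves GPU utilization but adds latency for the first client in the batch.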

From a code-policy point of view, make a new entry point for the multi-client server. I suggest a separate project that would use Whisper-Streaming as a module. I may not be available to maintain it in this repo.

@Gldkslfmsd (Collaborator, Author)

But more projects could use this feature, like https://github.com/ufal/correctable-lecture-translator . Open-sourcing and collaboration are welcome!
