
Indentation Bug in trt_server.py #25

Open

DamianB-BitFlipper opened this issue Feb 1, 2024 · 5 comments

@DamianB-BitFlipper
Contributor

In the file trt_server.py, I suspect that the highlighted lines need to be at the same indentation level as the while loop. Otherwise, in its current form, the code makes no sense to me. Just shining some light on this.

@makaveli10
Collaborator

Not really. We want to send only one response to the client. At some point we were sending all the responses added to the llm_queue for every update of the current segment from whisper-live, but then we decided to send only the one that corresponds to the transcription with eos=True.

That said, this if at https://github.com/collabora/WhisperFusion/blob/main/whisper_live/trt_server.py#L340-L343 could be at the same level as the outer if, and everything should still be fine.
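For reference, a minimal sketch of the consumer shape under discussion (a hedged reconstruction, not the actual trt_server.py code; the eos and llm_output keys are assumed from this thread, and the authoritative lines are at the link above):

```python
import queue

def consume(llm_queue: queue.Queue, send) -> None:
    while True:
        # Block until the LLM produces a response for the current segment.
        llm_response = llm_queue.get()
        if llm_response is not None:
            # Only the response tied to the eos=True transcription is
            # forwarded; responses for intermediate segment updates are
            # intentionally dropped.
            if llm_response.get("eos"):
                send(llm_response["llm_output"])
```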

@DamianB-BitFlipper
Contributor Author

Thanks for your reply! I understand the logic of only sending the responses with eos. But could there not be a backlog in the llm_queue such that it holds multiple sentences, where the first has an EOS and is then followed by another with its own EOS? In the current implementation, the first sentence would be lost.

@makaveli10
Collaborator

makaveli10 commented Feb 2, 2024

@DamianB-BitFlipper Not sure I understand what you mean by sentences; the queue holds llm_responses, each of which could be multiple sentences or a single word.

> In the current implementation, the first sentence would be lost.

Can you please give an example, if you have seen this happen?

@DamianB-BitFlipper
Contributor Author

I wouldn't expect to see this in most cases in practice, because the llm_queue empties rather quickly. I am just postulating, from exploring the code and poking at it, that if the transcriber sends [<first sentence, eos=True>, <second sentence, eos=True>], then, the way the code is written, the first sentence is lost.

I am aware that the transcriber does not put eos=True at the end of sentences, but rather at prolonged pauses of non-voice input. I am using "sentence" here purely as an example.
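To make the scenario concrete, here is a toy sketch (hypothetical values; the dict shape is assumed from this thread, and I have not observed this output in practice):

```python
import queue

llm_queue = queue.Queue()
# Two completed utterances are enqueued before the consumer wakes up:
llm_queue.put({"llm_output": "first sentence", "eos": True})
llm_queue.put({"llm_output": "second sentence", "eos": True})

# A consumer that drains the backlog and keeps only the newest item
# would send "second sentence" and silently drop "first sentence":
latest = None
while not llm_queue.empty():
    latest = llm_queue.get()
print(latest["llm_output"])  # -> second sentence
```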

@makaveli10
Collaborator

makaveli10 commented Feb 5, 2024

> I am just postulating, from exploring the code and poking at it, that if the transcriber sends [<first sentence, eos=True>, <second sentence, eos=True>], then, the way the code is written, the first sentence is lost.

@DamianB-BitFlipper Okay, so we should never reach this state in a short back-and-forth conversation: we transcribe until eos=True, and at that point the llm_queue should look like

[{output1, eos=False}, ..., {outputn, eos=True}]

We only care about outputn at this moment, because it is the most recent llm_output, corresponding to the most up-to-date transcription. I'm not sure why you would want output1.
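A sketch of this intended flow (hypothetical values mirroring the queue state above; send_to_client is a hypothetical stand-in for the actual websocket send, not the real function name):

```python
import queue

def send_to_client(text: str) -> None:  # hypothetical stand-in
    print(text)

llm_queue = queue.Queue()
# Successive LLM responses to updates of the same in-progress segment:
llm_queue.put({"llm_output": "output1", "eos": False})
llm_queue.put({"llm_output": "output2", "eos": False})
llm_queue.put({"llm_output": "outputn", "eos": True})

# Only the response corresponding to the final (eos=True) transcription
# is forwarded; the earlier responses are stale by construction.
while not llm_queue.empty():
    response = llm_queue.get()
    if response["eos"]:
        send_to_client(response["llm_output"])  # sends only "outputn"
```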
