`--diarize` flag is unreliable #216

savchenko · 2022-12-02T06:20:53Z

The Windows binary is from https://github.com/ggerganov/whisper.cpp/actions/runs/3596200207 (061fc81).

I have an audio recording of a conversation between two speakers, with one speaker on the left channel and the other on the right. There is no echo or audio bleed between the channels.

In the example below, the second line contains sentences from two different speakers, but the whole line is labelled "speaker 1". In reality, Speaker 1 finished with "...what the website is", and the next sentence, starting with "Because there's like...", belongs to Speaker 0.

[00:24:18.160 --> 00:24:24.400]  (speaker 1) XXXXXXXXX can do XXXXXXXXX these things. And then also once they do machine learning stuff,
--[ this line ]--> [00:24:24.400 --> 00:24:30.800]  (speaker 1) it's basically what the website is. Because there's like our capabilities include XXXXXXXXX,
[00:24:30.800 --> 00:24:40.720]  (speaker 0) site analysis, and then installing PyTorch. Well, I do remember one thing that he has

Is there any other information you might need to localise the bug?


savchenko · 2022-12-02T06:28:22Z

Waveform screenshot to check the separation:

ggerganov · 2022-12-02T18:31:14Z

Yes, this is expected.
The implemented strategy is super basic and it cannot be expected to always work reliably.
In this case it fails because a single text segment contains speech by both speakers, while the strategy assumes that only one speaker talks within each segment (see the comment in #64).
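To illustrate why one label per segment breaks down here, below is a minimal Python sketch of a per-segment channel comparison. It is not the whisper.cpp code: the function name `label_segment`, the `margin` threshold, the energy measure, and the mapping of left to speaker 0 are all assumptions made for illustration. The only thing taken from the thread is that each text segment receives a single speaker label.

```python
def label_segment(left, right, start, end, margin=1.1):
    """Return "0", "1", or "?" for the samples in [start, end).

    Hypothetical rule: the channel with clearly more energy wins.
    Assumes left = speaker 0 and right = speaker 1.
    """
    energy_left = sum(abs(s) for s in left[start:end])
    energy_right = sum(abs(s) for s in right[start:end])
    if energy_left > margin * energy_right:
        return "0"
    if energy_right > margin * energy_left:
        return "1"
    return "?"  # neither channel dominates


# Toy stereo signal: speaker 1 (right) talks for 6 samples,
# then speaker 0 (left) talks for 4 samples.
left = [0.0] * 6 + [0.5] * 4
right = [0.5] * 6 + [0.0] * 4

# One segment covering both speakers gets a single label.
whole = label_segment(left, right, 0, 10)

# The same audio split at the speaker change is labelled correctly.
first = label_segment(left, right, 0, 6)
second = label_segment(left, right, 6, 10)

print(whole, first, second)
```

In this toy case the combined segment is attributed entirely to speaker 1 because that speaker contributes more energy, which mirrors the mislabelled line in the report. Labels are only as fine-grained as the text segments themselves.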

ggerganov added the duplicate label Dec 2, 2022

ggerganov closed this as completed Dec 2, 2022

ggerganov mentioned this issue Dec 13, 2022

whisper : mark speakers/voices (diarization) #64

Open
