I don't often get feedback from actual users so I would love to know more about your actual use case (either here or via email if you want to keep it private).
You'd have to tell me a bit more about your audio. Its length might be one reason, as the hyper-parameters of the pretrained pipeline (
Removing overlapped speech before clustering would definitely help, as you'd only compare pure (as in "just one speaker") speech segments. Also, you are right that pyannote 2.0 will eventually contain an overlap-aware speaker diarization pipeline that should improve things a bit, but I cannot provide any ETA. Feel free to sponsor the project to make things faster ;-) Finally, you might be interested in that project, which will eventually be merged into pyannote 2.0.
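The "remove overlapped speech before clustering" idea boils down to interval subtraction: take the speech regions you would normally embed and cluster, and cut out every region flagged by an overlapped-speech detector. A minimal sketch with plain `(start, end)` tuples (in practice you would use the timelines produced by pyannote's detection pipelines; the function name here is hypothetical):

```python
def subtract_overlap(segments, overlaps):
    """Remove every overlapped region from each (start, end) speech segment,
    keeping only the "pure" single-speaker portions for clustering."""
    result = []
    for start, end in segments:
        pieces = [(start, end)]
        for o_start, o_end in overlaps:
            next_pieces = []
            for s, e in pieces:
                if o_end <= s or o_start >= e:
                    # overlap region does not touch this piece: keep it whole
                    next_pieces.append((s, e))
                else:
                    # keep the parts of (s, e) outside (o_start, o_end)
                    if s < o_start:
                        next_pieces.append((s, o_start))
                    if o_end < e:
                        next_pieces.append((o_end, e))
            pieces = next_pieces
        result.extend(pieces)
    return result

segments = [(0.0, 5.0), (6.0, 10.0)]
overlaps = [(4.0, 7.0)]
print(subtract_overlap(segments, overlaps))  # [(0.0, 4.0), (7.0, 10.0)]
```

Only the surviving pieces would then be passed to the embedding and clustering stages.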
-
Hi, thank you for creating this amazing repo. I have used this model and received lots of good feedback from users. I have two questions.
We tend to overestimate the number of speakers when the audio is long (e.g. 2 hours). Do you think this is because the longer an audio file is, the more overlapped segments we get? My thought is that these overlapped segments could form their own cluster and be treated as an additional speaker by the clustering step. I wanted to have your view on this.
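The intuition behind this hypothesis can be illustrated with a toy example (a deliberate simplification, not how real speaker embeddings are computed): if an overlapped segment's embedding behaves roughly like a mixture of the two speakers' embeddings, it lands at a similar, lower cosine similarity to both pure clusters, so a similarity-threshold clustering could seed a third cluster from it.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Hypothetical 2-D "embeddings" for two pure speakers.
spk_a = [1.0, 0.0]
spk_b = [0.0, 1.0]

# Crude model of an overlapped segment: the average of both embeddings.
mix = [(x + y) / 2 for x, y in zip(spk_a, spk_b)]

print(cosine(mix, spk_a))  # ~0.707: equally far from both pure clusters
print(cosine(mix, spk_b))  # ~0.707
```

With a clustering threshold above ~0.71, this mixture segment would join neither existing cluster, which is consistent with overlapped speech inflating the speaker count.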
If the hypothesis above is true, we presumably should remove the overlapped segments using pyannote's overlap detection within the pipeline. Thus, we are trying to modify the pyannote 1.1 pipeline to add overlap detection to it. But I believe this is roughly what you intend to do with pyannote 2.0. I checked the pyannote 2.0 pipeline, but it seems that overlap detection/resegmentation is not included yet. Would you be able to share when it will be added to pyannote 2.0?
Thank you in advance!