It is not clear where the problem really is; maybe you could fix the formatting of your issue.
If you mean the pipeline segments are wrong or misplaced, many factors can make it hard for the pretrained pipeline to perform well out of the box: noisy audio, specific acoustic conditions that were not seen when the model was trained, etc.
You might want to fine-tune the model on the type of data you target (and take a look at the available tutorial notebooks).
Tested versions
pyannote.audio = 3.3.1
System information
Ubuntu
Issue description
```python
from pyannote.audio import Pipeline
import torch

pipeline = Pipeline.from_pretrained(
    "pyannote/speaker-diarization-3.1",
    use_auth_token="hf_KkqHxRTGcaXXXXXXXsZvlMCDgAmBuSGCmXE")
pipeline.to(torch.device("cuda"))

diarization = pipeline("/root/Audio/Test.mp3")
for turn, _, speaker in diarization.itertracks(yield_label=True):
    print(f"start={turn.start:.1f}s stop={turn.end:.1f}s speaker_{speaker}")
```
```
start=0.6s stop=2.2s speaker_SPEAKER_00
start=3.5s stop=4.0s speaker_SPEAKER_00
```

Converted to SRT timestamps:

```
start=0.6s stop=2.2s -> 00:00:00,600 --> 00:00:02,200
start=3.5s stop=4.0s -> 00:00:03,500 --> 00:00:04,000
```
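For reference, the seconds-to-SRT conversion shown above can be sketched with a small helper (the function name is my own, not part of pyannote.audio):

```python
def to_srt_timestamp(seconds: float) -> str:
    """Convert a time in seconds to an SRT-style HH:MM:SS,mmm timestamp."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1_000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"

print(to_srt_timestamp(0.6))  # 00:00:00,600
print(to_srt_timestamp(2.2))  # 00:00:02,200
```

Rounding to whole milliseconds first avoids floating-point drift when splitting into hours, minutes, and seconds.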
The timestamps are wrong. The correct times are:
00:00:02,600 --> 00:00:04,486
00:00:05,439 --> 00:00:06,013
Please help!