I wanted to get finer segmentation out of pyannote, since most of my segments are very long when the same person keeps talking. Since min_duration_off is already set to 0.0, I looked through the code and found the classes Resegmentation and AdaptiveVoiceActivityDetection.
I thought applying one of those would give me shorter segments, but the code does not seem to work. Is this legacy code, or should it work?
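For context, min_duration_off controls how short same-speaker gaps are filled: pauses shorter than that many seconds are bridged, which makes segments longer. The sketch below only illustrates that idea and is not pyannote's implementation; fill_short_gaps and the (start, end, speaker) tuple representation are hypothetical names chosen for this example. It shows that a value of 0.0 leaves separated turns untouched, which suggests the long segments come from the segmentation output itself rather than from gap filling.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: (start, end, speaker), times in seconds.
Turn = Tuple[float, float, str]


def fill_short_gaps(turns: List[Turn], min_duration_off: float) -> List[Turn]:
    """Bridge same-speaker gaps that are shorter than min_duration_off.

    Illustrative only: this mimics the meaning of the hyperparameter,
    not the actual pyannote code path.
    """
    merged: List[Turn] = []
    last_index: Dict[str, int] = {}  # speaker -> index of their latest turn

    for start, end, speaker in sorted(turns):
        idx = last_index.get(speaker)
        if idx is not None and start - merged[idx][1] < min_duration_off:
            # Gap is short enough: extend the speaker's previous turn.
            prev_start, prev_end, _ = merged[idx]
            merged[idx] = (prev_start, max(prev_end, end), speaker)
        else:
            last_index[speaker] = len(merged)
            merged.append((start, end, speaker))
    return merged


turns = [(0.0, 4.0, "A"), (4.3, 9.0, "A"), (9.5, 12.0, "B")]

# With 0.0 no gap is bridged, so the turns come back unchanged.
unchanged = fill_short_gaps(turns, min_duration_off=0.0)

# With 0.5 the 0.3 s pause of speaker A is filled and the turns merge.
bridged = fill_short_gaps(turns, min_duration_off=0.5)
```

Under these assumptions, lowering min_duration_off further cannot shorten segments once it is already 0.0; any remaining long turns would have to be split by other means.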
For AdaptiveVoiceActivityDetection I get the error:
File "/home/.../PycharmProjects/..../venv/lib/python3.10/site-packages/pyannote/audio/pipelines/voice_activity_detection.py", line 313, in apply
vad_pipeline = VoiceActivityDetection("vad").instantiate(
File "/home/.../PycharmProjects/..../venv/lib/python3.10/site-packages/pyannote/audio/pipelines/voice_activity_detection.py", line 123, in __init__
model = get_model(segmentation, use_auth_token=use_auth_token)
File "/home/.../PycharmProjects/.../venv/lib/python3.10/site-packages/pyannote/audio/pipelines/utils/getter.py", line 89, in get_model
model.eval()
AttributeError: 'NoneType' object has no attribute 'eval'
Could not download 'vad' model.
It looks to me as if the instantiation is hardcoded (line 313) and the model key "vad" cannot be found?
For Resegmentation I get the error below. I cannot see what is wrong with my way of instantiating it, since the same approach works for SpeakerDiarization, for example:
File "/home/.../PycharmProjects/.../pyannote_service.py", line 125, in diarize
reseg = self.resegmentation_model(file=tensor_audio_mapping, diarization=diarization)
File "/home/.../PycharmProjects/.../venv/lib/python3.10/site-packages/pyannote/audio/core/pipeline.py", line 304, in __call__
raise RuntimeError(
RuntimeError: A pipeline must be instantiated with `pipeline.instantiate(parameters)` before it can be applied.
I initialize the model like this:
self.resegmentation_model = Resegmentation(segmentation=MODEL_PATH_SEG)
self.resegmentation_model.instantiate(co["params"])
self.resegmentation_model.to(self.device)
# diarization is the diarization object produced by a speaker_d pipeline from pyannote
reseg = self.resegmentation_model(file=tensor_audio_mapping, diarization=diarization)
Okay, for the Resegmentation pipeline it seems that it does not work with pyannote/segmentation-3.0. It does work with pyannote/segmentation, which unfortunately gives me a few warnings:
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Tested versions
System information
Ubuntu 22.04, Lenovo P1 Gen 5 Workstation A4500
Issue description
In the initialization code above, co refers to a YAML file.
I would appreciate any insights I might have missed, or just a short clarification if the code is not intended for use.
Minimal reproduction example (MRE)
Can be found in my example above