mexca-sd-experiment

A repository for comparing potential speaker diarization tools to be used in the MEXCA pipeline.

Structure

The repository contains subdirectories for the different parts of the experiment, along with a few top-level files:

- speaker-diarization\: Contains all files for the speaker diarization part
  - embeddings\: Contains the encoded speaker embeddings as .pt files
  - results\: Contains the .rttm files with speaker annotations
  - clustering.py: Script for clustering the speaker embeddings and assigning speaker labels to the speaker segments
  - sd_*.py: Scripts for applying the respective speaker encoding models
  - compare_sd.ipynb: Notebook for comparing the speaker diarization approaches
  - speaker_diarization.py: Script to run all speaker encoding scripts one after another
  - speaker_representation.py: Helper functions for performing speaker diarization
- voice-activity-detection\: Contains all files for the voice activity detection part
  - results\: Contains the .rttm files with speech segments
  - compare_vad.ipynb: Notebook for comparing the voice activity detection approaches
  - custom.conf: Configuration file for the opensmile feature extractor
  - opensmile_helper_functions: Helper functions for extracting opensmile voice activity features
  - vad_*.py: Scripts for applying the voice activity detection models
- explore_ami_corpus.ipynb: Notebook for exploring the properties of the AMI corpus
- rttm.py: Functions for creating, reading, modifying, and writing .rttm files and objects
- test_rttm.py: Preliminary test suite for rttm.py
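
For readers unfamiliar with the .rttm files mentioned above, the following is a minimal sketch of reading and writing a single RTTM `SPEAKER` line in Python. It assumes the common ten-field RTTM layout (type, file ID, channel, onset, duration, orthography, speaker type, speaker name, confidence, lookahead). The names `Segment`, `parse_rttm_line`, and `format_rttm_line`, as well as the sample line, are hypothetical and only for illustration; they do not reflect the actual API of rttm.py.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One speaker turn (hypothetical container, not the class used in rttm.py)."""
    file_id: str
    onset: float      # start time in seconds
    duration: float   # length in seconds
    speaker: str

    @property
    def end(self) -> float:
        return self.onset + self.duration


def parse_rttm_line(line: str) -> Segment:
    """Parse one whitespace-separated RTTM SPEAKER line into a Segment."""
    fields = line.split()
    if len(fields) < 8 or fields[0] != "SPEAKER":
        raise ValueError(f"not an RTTM SPEAKER line: {line!r}")
    return Segment(
        file_id=fields[1],
        onset=float(fields[3]),
        duration=float(fields[4]),
        speaker=fields[7],
    )


def format_rttm_line(seg: Segment, channel: int = 1) -> str:
    """Serialize a Segment back to an RTTM line; unused fields are written as <NA>."""
    return (
        f"SPEAKER {seg.file_id} {channel} {seg.onset:.3f} {seg.duration:.3f} "
        f"<NA> <NA> {seg.speaker} <NA> <NA>"
    )


# Illustrative line; the file ID and speaker label are made up for this example.
line = "SPEAKER ES2002a 1 12.500 3.250 <NA> <NA> spk_A <NA> <NA>"
seg = parse_rttm_line(line)
print(seg.speaker, seg.onset, seg.end)
print(format_rttm_line(seg))
```

The same line layout serves both the speech segments in voice-activity-detection\results\ and the speaker annotations in speaker-diarization\results\; for voice activity output, the speaker field would typically hold a single generic label.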
