Speaker Diarization and Targeted Voice Separation

This project performs speaker diarization and targeted voice separation using the pyannote.audio library. It identifies and processes specific speaker segments from audio files, combining them into a single output file.

Features

Diarizes audio to detect different speakers.
Filters speaker segments based on duration and speaker matching.
Merges and saves targeted audio segments into a single output file.

Setup Instructions

1. Clone the Repository

git clone https://github.com/Chiyan200/targeted-voice-separation.git cd targeted-voice-separation.git

### **2. Install Dependencies**
Install the required Python packages from the requirements.txt file

### **3. Set Up Environment Variables**

Create a .env file in the root directory and add your Hugging Face token:
HF_TOKEN=your_hugging_face_token

### **4. Prepare Audio Files**
Place the dataset audio files in the audioData/dataset folder.
Provide the target voice file as target_voice_path.

### **5. Run the Script**
Place the dataset audio files in the audioData/dataset folder.
Provide the target voice file as target_voice_path.

python script_name.py

### **5. Project Structure**
.
├── audioData/
│   ├── dataset/                 # Folder for input audio files
│   ├── segment/                 # Folder for processed audio output
├── json/                        # Folder for diarization JSON outputs
├── script_name.py               # Main Python script
├── requirements.txt             # Python dependencies
├── .env                         # Environment variables
└── README.md                    # Documentation

Functions
    Diarization
    Performs speaker diarization and filters speaker data based on minimum duration.

predictSpeaker
    Filters the diarization data to include segments for the targeted speaker.

process_and_save_audio
    Cuts and merges specific audio segments into a single waveform based on diarization results.

Output
    A JSON file containing diarization results is saved in the json/ folder.
    A merged WAV file containing targeted audio segments is saved in the specified output path.

Name		Name	Last commit message	Last commit date
Latest commit History 2 Commits
audioData/dataset		audioData/dataset
.gitignore		.gitignore
README.md		README.md
requirements.txt		requirements.txt
speakerDiarization.py		speakerDiarization.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Speaker Diarization and Targeted Voice Separation

Features

Setup Instructions

1. Clone the Repository

About

Releases

Packages

Languages

Chiyan200/targeted-voice-separation

Folders and files

Latest commit

History

Repository files navigation

Speaker Diarization and Targeted Voice Separation

Features

Setup Instructions

1. Clone the Repository

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages