- Segmentation using existing verse timestamps (Sec 4.1.1)
- Forced alignment using pre-trained acoustic models (Sec 4.1.2)
- Forced alignment from scratch (Sec 4.1.3)
- Data-checker code for outlier detection (Sec 4.2)
- VITS TTS model training with coqui-ai (Sec 5)
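
To make the outlier-detection item above more concrete, here is a minimal sketch of one way such a data check could work. It is an illustration under stated assumptions, not the data-checker described in Sec 4.2: the function name `flag_outliers`, the `(segment_id, text, duration_seconds)` tuple layout, the characters-per-second criterion, and the 3.5 threshold are all hypothetical choices made for this example.

```python
from statistics import median


def flag_outliers(segments, threshold=3.5):
    """Return ids of segments whose speaking rate looks anomalous.

    segments: list of (segment_id, text, duration_seconds) tuples.
    This layout and the characters-per-second criterion are assumptions
    made for illustration only.
    """
    # Segments with a non-positive duration cannot be rated; flag them directly.
    flagged = [sid for sid, _, dur in segments if dur <= 0]

    # Speaking rate in characters per second for the remaining segments.
    rates = {sid: len(text) / dur for sid, text, dur in segments if dur > 0}
    if not rates:
        return flagged

    med = median(rates.values())
    # Median absolute deviation: a spread estimate that is robust to outliers.
    mad = median(abs(r - med) for r in rates.values())
    if mad == 0:
        return flagged

    # Modified z-score; 0.6745 rescales the MAD to be comparable to a
    # standard deviation under a normal distribution.
    flagged += [
        sid for sid, r in rates.items()
        if 0.6745 * abs(r - med) / mad > threshold
    ]
    return flagged


if __name__ == "__main__":
    demo = [
        ("v1", "a" * 30, 2.0),
        ("v2", "a" * 32, 2.0),
        ("v3", "a" * 28, 2.0),
        ("v4", "a" * 31, 2.0),
        ("v5", "a" * 30, 0.2),  # same text length in a tenth of the time
    ]
    print(flag_outliers(demo))
```

A segment whose audio is far too short or too long for its text usually points to a misaligned boundary, which is why a rate-based check is a plausible first filter before training. The median-based statistic is used here so that the bad segments themselves do not distort the reference rate.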