Canetis

Canetis is a recursive forced aligner built on Gentle. Given an audio file and its transcript, it reports the timestamp at which each word in the transcript is spoken.

There are many other excellent forced aligners out there, including Gentle itself. However, on particularly long or noisy audio files, small errors can accumulate within standard forced aligners, so that fewer words end up aligned. To address this, Canetis implements the recursive algorithm described by Moreno et al. in the paper “A Recursive Algorithm for the Forced Alignment of Very Long Audio Segments”. We have found Canetis's performance to be noticeably better than that of standard aligners such as Gentle, and we hope you find it useful!
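The recursive idea can be illustrated with a small sketch. This is not Canetis's actual code, and it simplifies the anchor selection described in the paper: it treats every successfully aligned word as an anchor and re-runs the aligner on each unaligned stretch between anchors. The names `recursive_align`, `align_fn`, and `max_depth` are hypothetical. `align_fn` stands in for a base aligner and is assumed to return one dictionary per word with "word" and "success" keys, plus "start" and "end" times when the word was aligned.

```python
def recursive_align(words, start, end, align_fn, max_depth=3):
    """Align `words` to the audio span [start, end], retrying unaligned gaps.

    `align_fn(words, start, end)` is a hypothetical stand-in for a base
    aligner. It must return one dict per word with "word" and "success"
    keys, plus "start" and "end" times when "success" is true.
    """
    result = align_fn(words, start, end)
    if max_depth == 0:
        return result

    merged = []
    i, n = 0, len(result)
    while i < n:
        if result[i]["success"]:
            merged.append(result[i])
            i += 1
            continue
        # Find the end of this run of unaligned words.
        j = i
        while j < n and not result[j]["success"]:
            j += 1
        if j - i == n:
            # Nothing was aligned, so no anchor exists to narrow the span.
            merged.extend(result[i:j])
        else:
            # Bound the gap by the neighbouring aligned words (the anchors).
            gap_start = result[i - 1]["end"] if i > 0 else start
            gap_end = result[j]["start"] if j < n else end
            merged.extend(
                recursive_align(words[i:j], gap_start, gap_end,
                                align_fn, max_depth - 1)
            )
        i = j
    return merged
```

Each recursive call works on a shorter stretch of audio and a shorter list of words, which is what gives the base aligner a better chance on material it missed in the first pass.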

Installation Process

Dependencies

Python 2.7 (Canetis will not work with Python 3)
pip (linked to the Python 2.7 installation)
Git

Install

Clone the source onto your machine. cd into the canetis directory and run the following:

./install.sh
source ~/.bashrc

This requires sudo access. The script installs all required dependencies, installs Canetis, and performs the required configuration.

Verify Your Installation

Run the following command:

python2 test/test.py

After a short wait, it should print test alignment results for both Gentle and Canetis.

Usage

python2 align.py audio.wav transcript.txt output.txt

Writes the alignment to output.txt as JSON. Each word is described by the following keys:

"start" - the time in the audio at which the word starts
"end" - the time in the audio at which the word ends
"word" - the word itself
"success" - whether the word was successfully aligned

Results

We tested Canetis on a set of forensic interview audio recordings and transcripts collected by the USC Gould School of Law. On average, Canetis aligned 11.7% more words than Gentle, which we consider a significant improvement.
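To measure the alignment rate on your own files, the "success" key described under Usage can be counted directly. The sketch below is only one way to do this and is not necessarily how the figure above was computed. It assumes the output file holds a JSON list with one dictionary per word; check this against your own output before relying on it. The helper names `load_alignment` and `alignment_rate` are hypothetical. The code runs under both Python 2.7 and Python 3.

```python
import json


def load_alignment(path):
    """Read the JSON written by align.py (assumed: a list of per-word dicts)."""
    with open(path) as f:
        return json.load(f)


def alignment_rate(entries):
    """Return the fraction of words whose "success" flag is true."""
    if not entries:
        return 0.0
    aligned = sum(1 for e in entries if e.get("success"))
    # float() keeps the division exact under Python 2.7 as well.
    return aligned / float(len(entries))
```

For example, `alignment_rate(load_alignment("output.txt"))` gives the share of transcript words that were aligned in a single run.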

Questions?

If you have questions or something doesn't work the way you expect, please let us know! We are always looking to make this better. The best way to reach out is by creating a GitHub issue -- we'll be able to see it and respond promptly.

Contributors

This project was created by Nihar Sheth and Kian Ghodoussi.
