Generalized Deep Multiset Canonical Correlation Analysis for Multiview Learning of Speech Representations
Updated Apr 9, 2019 - Python
Fine-tuning wav2vec2 for Pathological Speech Processing
DNN embeddings extraction from audio and speech recordings using PyTorch.
Cross-lingual Transfer for Speech Processing using Acoustic Language Similarity
The Dis-Vector project enhances voice conversion and synthesis through disentangled embeddings, allowing for high-quality, zero-shot voice cloning across multiple languages. This model leverages separate encoders for content, pitch, rhythm, and timbre, enabling precise control over synthesized voice characteristics.
This repository belongs to my Bachelor's thesis on predicting voice likability from pre-trained speech embeddings.