TTS without Text? #18

ErfolgreichCharismatisch · 2021-01-23T14:27:28Z

As I understand it, this TTS algorithm works on audio files that have no text assigned to them.

How would it understand the content or the language?
Does it work only with the LJSpeech dataset, or with any dataset in the LJSpeech structure?

ivanvovk · 2021-01-23T14:57:20Z

@ErfolgreichCharismatisch Modern TTS models consist of two parts: a feature generator and a vocoder. The feature generator produces low-dimensional time-frequency acoustic features from text, while the vocoder reconstructs the raw waveform from these features. The two models are trained separately. WaveGrad is the second part, the vocoder: it takes acoustic features (mel-spectrograms) as input, not text. Because mel-spectrograms are computed directly from the audio, it can be trained on an arbitrary audio dataset without transcripts.
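To make the split concrete, here is a minimal sketch of the data flow in plain Python. Everything in it is a hypothetical stand-in and not WaveGrad's actual code: the function names, the `HOP_LENGTH` and `N_MELS` values, and the log-energy "features" (a toy substitute for a real mel-spectrogram) are all assumptions chosen for illustration. The point is only that the vocoder's training pairs come from audio alone, and that text appears solely on the feature-generator side.

```python
import math
from typing import List

HOP_LENGTH = 300  # assumed number of waveform samples per feature frame
N_MELS = 80       # assumed feature dimension per frame

Frames = List[List[float]]  # shape: [n_frames][N_MELS]


def toy_features_from_audio(waveform: List[float]) -> Frames:
    """Toy stand-in for mel-spectrogram extraction: needs audio only, no text."""
    frames = []
    for start in range(0, len(waveform) - HOP_LENGTH + 1, HOP_LENGTH):
        chunk = waveform[start:start + HOP_LENGTH]
        energy = sum(s * s for s in chunk) / HOP_LENGTH
        # A real extractor would produce N_MELS distinct band energies;
        # here the frame's log-energy is simply repeated.
        frames.append([math.log(energy + 1e-9)] * N_MELS)
    return frames


def toy_feature_generator(text: str) -> Frames:
    """Toy stand-in for a text-to-feature model: one silent frame per character."""
    return [[0.0] * N_MELS for _ in text]


def toy_vocoder(frames: Frames) -> List[float]:
    """Toy stand-in for a vocoder: emits HOP_LENGTH samples per input frame."""
    return [0.0] * (len(frames) * HOP_LENGTH)


def synthesize(text: str) -> List[float]:
    """Full TTS pipeline: text -> acoustic features -> waveform."""
    return toy_vocoder(toy_feature_generator(text))
```

In this sketch a vocoder training example would be the pair `(toy_features_from_audio(wave), wave)`, which never involves a transcript, while `synthesize("hello")` shows where a separately trained feature generator plugs in at inference time.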

ErfolgreichCharismatisch · 2021-01-23T15:35:49Z

Interesting. So which feature generator(s) does it work with out of the box?
