WaveNet vocoder

A PyTorch implementation of the WaveNet vocoder, which generates raw speech samples conditioned on mel spectrograms. This is the final stage of a speech synthesis pipeline: reconstructing an audio signal from its mel spectrogram.

Usage

You can download my pretrained model or train your own. The settings for computing mel spectrograms are defined in MelSpectrogramConfig and applied as follows:

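import torch

from config import MelSpectrogramConfig
from src.preprocessing import MelSpectrogram

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
featurizer = MelSpectrogram(MelSpectrogramConfig()).to(device)

# audio_wav: waveform tensor you have already loaded
mel_spectrogram = featurizer(audio_wav.to(device))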

Then generate audio with the loaded model:

predicted_audio = model.inference(mel_spectrogram)
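
To get a feel for how the mel spectrogram settings relate to the length of the generated audio, here is a minimal sketch. It is not taken from this repository: ExampleMelConfig, its field names, and its values are hypothetical stand-ins for whatever MelSpectrogramConfig actually defines, and the frame count assumes a centered STFT (one frame per hop, plus one).

```python
from dataclasses import dataclass


@dataclass
class ExampleMelConfig:
    # Hypothetical values for illustration only; the repository's
    # MelSpectrogramConfig may use different names and numbers.
    sample_rate: int = 22050
    n_fft: int = 1024
    hop_length: int = 256
    n_mels: int = 80


def mel_shape(num_samples: int, cfg: ExampleMelConfig) -> tuple:
    """Shape (n_mels, frames) of a mel spectrogram, assuming a centered STFT."""
    frames = 1 + num_samples // cfg.hop_length
    return (cfg.n_mels, frames)


def approx_audio_length(frames: int, cfg: ExampleMelConfig) -> int:
    """Rough number of samples a vocoder would generate for `frames` mel frames,
    assuming it produces hop_length samples per frame."""
    return frames * cfg.hop_length


cfg = ExampleMelConfig()
shape = mel_shape(cfg.sample_rate, cfg)  # one second of audio
print(shape)
print(approx_audio_length(shape[1], cfg))
```

Under these assumptions, each mel frame corresponds to hop_length output samples, so a longer hop means fewer frames to condition on but more samples to generate per frame.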