Problem with generated audio from pre-trained checkpoints #11

swamiviv · 2021-01-24T23:24:08Z

I used the pretrained checkpoints (64md_8k) for the sc09 dataset and generated samples as recommended. I used the following to load the output and listen to it:

import scipy.io
import IPython.display as ipd

fname = 'commands_listen.mat'
mat = scipy.io.loadmat(fname)
sr = 22050 # sample rate
ipd.Audio(mat['reconstructed'][0, :], rate=sr) # play the first row as audio

Most of the samples are unintelligible, although I can make out some sounds here and there. Is that normal?
Out of curiosity, are the examples you present on the website cherry-picked?
Am I doing something wrong when generating the samples?


andimarafioti · 2021-04-13T14:42:05Z

I don't think that is normal.
The examples are not cherry-picked. The output is an array of sounds, and you may notice that some sound better than others.
Most likely you are doing something wrong. Can you give more details?

andimarafioti · 2021-04-13T14:48:54Z

Are you using the speech checkpoint? There are a few checkpoints for piano too.

andimarafioti · 2021-04-13T14:50:54Z

Could it be that you are indexing the matrix along the wrong axis? Try mat['reconstructed'][:, 0] instead of mat['reconstructed'][0, :].
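
One way to check the indexing question is to look at the shape of the loaded array before playing anything. The sketch below is only an illustration: the helper name extract_clip and the rule "the shorter axis enumerates clips" are assumptions made for this example and are not part of the repository. The 64 x 16384 array in the demo is synthetic.

```python
import numpy as np


def extract_clip(samples, index=0, clips_axis=None):
    """Return one 1-D clip from a 2-D array of stacked audio clips.

    Hypothetical helper: if clips_axis is None, it guesses that the
    shorter axis enumerates the clips and the longer axis holds the
    audio samples. That is a heuristic, not documented behaviour.
    """
    samples = np.asarray(samples)
    if samples.ndim == 1:
        return samples  # already a single clip
    if samples.ndim != 2:
        raise ValueError(f"expected a 1-D or 2-D array, got shape {samples.shape}")
    if clips_axis is None:
        clips_axis = int(np.argmin(samples.shape))  # shorter axis = clip index
    # Select clip `index` along the clip axis; the result is 1-D.
    return np.take(samples, index, axis=clips_axis)


if __name__ == "__main__":
    # Synthetic stand-in for mat['reconstructed']; the real shape is unknown here.
    fake = np.zeros((16384, 64))
    print("array shape:", fake.shape)
    print("clip shape:", extract_clip(fake).shape)
```

With the real file, the same check would be print(mat['reconstructed'].shape) followed by ipd.Audio(extract_clip(mat['reconstructed']), rate=sr). If the long axis turns out to be the first one, then [0, :] returns one sample from each clip rather than a single clip, which would explain audio that sounds mostly like noise.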

andimarafioti closed this as completed Apr 13, 2021

andimarafioti reopened this Apr 13, 2021
