
Releases: yxlllc/DDSP-SVC

5.0: Improved DDSP Cascade Diffusion Model

08 Feb 09:46

model_0.pt is a pre-trained model using the contentvec768l12 encoder.
A demo of training from scratch (without using a pre-trained model) is here.
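Before fine-tuning on top of model_0.pt, it can be worth confirming that the checkpoint loads cleanly in your environment. A minimal sanity check, assuming the file is a standard PyTorch checkpoint and that torch is installed (the path is a placeholder for wherever you saved it):

# Hypothetical check: load the pre-trained checkpoint on CPU and list its
# top-level keys (typically model weights and training metadata).
python -c "import torch; ckpt = torch.load('model_0.pt', map_location='cpu'); print(list(ckpt.keys()) if isinstance(ckpt, dict) else type(ckpt))"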

4.0: DDSP Cascade Diffusion Model

15 Aug 15:47

Unzip the demo model into the exp directory, unzip the sample audio files into the main directory, then run the demo samples:

# opencpop (1st speaker)
python main_diff.py -i samples/source.wav -diff exp/diffusion-new-demo/model_200000.pt -o samples/svc-opencpop+12key.wav -id 1 -k 12 -kstep 100
# kiritan (2nd speaker)
python main_diff.py -i samples/source.wav -diff exp/diffusion-new-demo/model_200000.pt -o samples/svc-kiritan+12key.wav -id 2 -k 12 -kstep 100
# mix the timbres of opencpop and kiritan at a 0.5:0.5 ratio
python main_diff.py -i samples/source.wav -diff exp/diffusion-new-demo/model_200000.pt -o samples/svc-opencpop_kiritan_mix+12key.wav -mix "{1:0.5,2:0.5}" -k 12 -kstep 100
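The -kstep argument appears to set the number of diffusion steps used at inference, trading speed against quality. To compare settings, the documented command can be looped over several values; this is a sketch, and the output filenames are made up:

# hypothetical sweep over -kstep values; only the documented flags are used
for KSTEP in 50 100 200; do
    python main_diff.py -i samples/source.wav -diff exp/diffusion-new-demo/model_200000.pt -o samples/svc-opencpop+12key-kstep${KSTEP}.wav -id 1 -k 12 -kstep ${KSTEP}
done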

The training data for this 2-speaker model comes from opencpop and kiritan.

Thanks to CN_ChiTu for helping to train this model.

3.0: Dramatically improved audio quality with a shallow diffusion model

13 May 18:45

Unzip the two demo models into the exp directory, then run the demo samples:

# opencpop (1st speaker)
python main_diff.py -i samples/source.wav -ddsp exp/ddsp-demo/model_300000.pt -diff exp/diffusion-demo/model_400000.pt -o samples/svc-opencpop+12key.wav -id 1 -k 12 -kstep 300
# kiritan (2nd speaker)
python main_diff.py -i samples/source.wav -ddsp exp/ddsp-demo/model_300000.pt -diff exp/diffusion-demo/model_400000.pt -o samples/svc-kiritan+12key.wav -id 2 -k 12 -kstep 300
# mix the timbres of opencpop and kiritan at a 0.5:0.5 ratio
python main_diff.py -i samples/source.wav -ddsp exp/ddsp-demo/model_300000.pt -diff exp/diffusion-demo/model_400000.pt -o samples/svc-opencpop_kiritan_mix+12key.wav -mix "{1:0.5,2:0.5}" -k 12 -kstep 300
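To render both demo speakers in one go, the two single-speaker commands above can be collapsed into a shell loop over the documented -id values; this is a sketch, and the output filenames are illustrative:

# hypothetical batch conversion over the two demo speaker ids
for ID in 1 2; do
    python main_diff.py -i samples/source.wav -ddsp exp/ddsp-demo/model_300000.pt -diff exp/diffusion-demo/model_400000.pt -o samples/svc-speaker${ID}+12key.wav -id ${ID} -k 12 -kstep 300
done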

The training data for this 2-speaker model comes from opencpop and kiritan.

Thanks to lafi2333 for helping to train the demo models.

2.0: Greatly optimized training speed

21 Mar 16:46

Unzip the pre-trained model into the exp directory, then run the demo samples:

# opencpop (1st speaker)
python main.py -i samples/source.wav -m exp/multi_speaker/model_300000.pt -o samples/svc-opencpop+12key.wav -k 12 -id 1
# kiritan (2nd speaker)
python main.py -i samples/source.wav -m exp/multi_speaker/model_300000.pt -o samples/svc-kiritan+12key.wav -k 12 -id 2
# mix the timbres of opencpop and kiritan at a 0.5:0.5 ratio
python main.py -i samples/source.wav -m exp/multi_speaker/model_300000.pt -o samples/svc-opencpop_kiritan_mix+12key.wav -k 12 -mix "{1:0.5, 2:0.5}"
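As the demo shows, the -mix argument takes a dictionary-style string mapping each speaker id to a weight. A hypothetical sweep over several blends, reusing only the documented flags (the weights in each pair sum to 1.0, and the output names are made up):

# hypothetical sweep over -mix ratios between the two demo speakers
I=0
for MIX in "{1:0.25, 2:0.75}" "{1:0.5, 2:0.5}" "{1:0.75, 2:0.25}"; do
    I=$((I + 1))
    python main.py -i samples/source.wav -m exp/multi_speaker/model_300000.pt -o samples/svc-mix-${I}.wav -k 12 -mix "${MIX}"
done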

The training data for this 2-speaker model comes from opencpop and kiritan.

Thanks to CN_ChiTu for helping to train this model.

Multi-speaker support and timbre mixing

08 Mar 14:02

Unzip the pre-trained model into the exp directory, then run the demo samples:

# opencpop (1st speaker)
python main.py -i samples/source.wav -m exp/multi_speaker/model_300000.pt -o samples/svc-opencpop+12key.wav -k 12 -pe crepe -e true -id 1
# kiritan (2nd speaker)
python main.py -i samples/source.wav -m exp/multi_speaker/model_300000.pt -o samples/svc-kiritan+12key.wav -k 12 -pe crepe -e true -id 2
# mix the timbres of opencpop and kiritan at a 0.5:0.5 ratio
python main.py -i samples/source.wav -m exp/multi_speaker/model_300000.pt -o samples/svc-opencpop_kiritan_mix+12key.wav -k 12 -pe crepe -e true -mix "{1:0.5, 2:0.5}"

The training data for this 2-speaker model comes from opencpop and kiritan.

Thanks to CN_ChiTu for helping to train this model.

1.0

05 Mar 03:08

Unzip the pre-trained model into the exp directory, then run the demo samples:

# original output (without the enhancer)
python main.py -i samples/source.wav -m exp/opencpop/model_300000.pt -o samples/svc-opencpop+10key-origin.wav -k 10 -pe crepe
# enhanced output
python main.py -i samples/source.wav -m exp/opencpop/model_300000.pt -o samples/svc-opencpop+10key-enhance.wav -k 10 -pe crepe -e true
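Judging from the output names, -k shifts the key of the source (presumably in semitones). To find the shift that best fits the target voice, here is a hypothetical sweep over a few values with the enhancer enabled; the filenames are illustrative:

# hypothetical key sweep; only the documented flags are used
for KEY in 8 10 12; do
    python main.py -i samples/source.wav -m exp/opencpop/model_300000.pt -o samples/svc-opencpop+${KEY}key-enhance.wav -k ${KEY} -pe crepe -e true
done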

The training data comes from opencpop.