Generate sounds using pre-trained RegNet
PeihaoChen committed Nov 23, 2020
1 parent 532f199 commit b49fc5b
Showing 5 changed files with 2,863 additions and 1 deletion.
8 changes: 8 additions & 0 deletions .gitignore
@@ -0,0 +1,8 @@
ckpt
data
.vscode
*.egg-info
.ipynb_checkpoints/
__pycache__
my_scripts/
logs/
21 changes: 20 additions & 1 deletion README.md
@@ -79,7 +79,7 @@ Training the REGNET from scratch. The results will be saved to `ckpt/dog`.
```bash
CUDA_VISIBLE_DEVICES=7 python train.py \
save_dir ckpt/dog \
-auxiliary_dim 64 \
+auxiliary_dim 32 \
rgb_feature_dir data/features/dog/feature_rgb_bninception_dim1024_21.5fps \
flow_feature_dir data/features/dog/feature_flow_bninception_dim1024_21.5fps \
mel_dir data/features/dog/melspec_10s_22050hz \
@@ -123,6 +123,25 @@ git clone https://github.com/r9y9/wavenet_vocoder && cd wavenet_vocoder
git checkout 2092a64
```

## Pre-trained Models
You can also use our pre-trained RegNet for generating visually aligned sounds.

First, download our pre-trained RegNet ([Dog](https://github.com/PeihaoChen/regnet/releases/download/Pretrained_RegNet/RegNet_dog_checkpoint_041000.tar)) to the `./ckpt/dog` folder and extract it.
```bash
tar -xvf ./ckpt/dog/RegNet_dog_checkpoint_041000.tar # extract the archive
```
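
The download-and-extract step can also be scripted. The following is a minimal Python sketch, not part of the repository: `download_checkpoint` and `extract_checkpoint` are hypothetical helper names, and only the URL is taken from the link above. The README does not say whether the archive's members are laid out relative to the repository root or to `ckpt/dog`, so inspect the returned member names before relying on a particular path.

```python
import tarfile
import urllib.request
from pathlib import Path

# URL copied from the README link above.
CHECKPOINT_URL = (
    "https://github.com/PeihaoChen/regnet/releases/download/"
    "Pretrained_RegNet/RegNet_dog_checkpoint_041000.tar"
)


def download_checkpoint(url: str, dest_dir: str) -> Path:
    """Download the archive into dest_dir and return its local path."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    archive = dest / url.rsplit("/", 1)[-1]
    if not archive.exists():  # skip the download if the file is already there
        urllib.request.urlretrieve(url, str(archive))
    return archive


def extract_checkpoint(archive: Path, dest_dir: str) -> list:
    """Extract a tar archive into dest_dir and return the member names."""
    dest = Path(dest_dir).resolve()
    with tarfile.open(archive) as tar:
        for member in tar.getmembers():
            # Refuse members that would be written outside dest_dir.
            target = (dest / member.name).resolve()
            if dest != target and dest not in target.parents:
                raise ValueError(f"unsafe path in archive: {member.name}")
        tar.extractall(dest)
        return tar.getnames()
```

A typical call would be `extract_checkpoint(download_checkpoint(CHECKPOINT_URL, "ckpt/dog"), "ckpt/dog")`. Extracting into `ckpt/dog` is an assumption made here so that the files end up next to the archive.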


Second, run the inference code.
```bash
CUDA_VISIBLE_DEVICES=0 python test.py \
-c config/dog_opts.yml \
aux_zero True \
checkpoint_path ckpt/dog/checkpoint_041000 \
save_dir ckpt/dog/inference_result \
wavenet_path /path/to/wavenet_dog.pth
```
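
The command above passes options as alternating `KEY VALUE` tokens after the `-c` config file. How `test.py` parses them is not shown in this diff. The sketch below only illustrates one plausible way such pairs could be merged over defaults loaded from the YAML file; `apply_overrides` is a hypothetical helper, not a function from the repository.

```python
import ast


def apply_overrides(defaults: dict, pairs: list) -> dict:
    """Return a copy of defaults updated with alternating KEY VALUE strings."""
    if len(pairs) % 2 != 0:
        raise ValueError("overrides must come in KEY VALUE pairs")
    merged = dict(defaults)
    for key, raw in zip(pairs[0::2], pairs[1::2]):
        try:
            value = ast.literal_eval(raw)  # "True" -> True, "32" -> 32
        except (ValueError, SyntaxError):
            value = raw  # plain strings such as paths are kept as-is
        merged[key] = value
    return merged


# A few defaults copied from config/dog_opts.yml in this commit.
defaults = {
    "aux_zero": False,
    "checkpoint_path": "",
    "save_dir": "ckpt/dog",
}

opts = apply_overrides(defaults, [
    "aux_zero", "True",
    "checkpoint_path", "ckpt/dog/checkpoint_041000",
    "save_dir", "ckpt/dog/inference_result",
])
```

With this approach the YAML file keeps the training defaults (for example `aux_zero: false`), while inference-time changes stay visible on the command line.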

Enjoy your experiments!


50 changes: 50 additions & 0 deletions config/dog_opts.yml
@@ -0,0 +1,50 @@
D_interval: 1
audio_samples: 10
aux_zero: false
auxiliary_dim: 32
auxiliary_sample_rate: 32
auxiliary_type: lstm_last
batch_size: 8
beta1: 0.5
checkpoint_path: ''
continue_train: false
cudnn_benchmark: false
cudnn_enabled: true
decoder_conv_dim: 1024
dist_backend: nccl
dist_url: tcp://localhost:54321
dynamic_loss_scaling: true
encoder_embedding_dim: 2048
encoder_kernel_size: 5
encoder_n_convolutions: 3
encoder_n_lstm: 2
epoch_count: 0
epochs: 1000
exclude_dirs:
- ckpt
- data
flow_feature_dir: data/features/dog/feature_flow_bninception_dim1024_21.5fps
grad_clip_thresh: 1.0
lambda_Oriloss: 10000.0
lambda_Silenceloss: 0
loss_type: MSE
lr: 0.0002
mel_dir: data/features/dog/melspec_10s_22050hz
mel_samples: 860
mode_input: ''
n_mel_channels: 80
niter: 100
num_epoch_save: 10
postnet_embedding_dim: 512
postnet_kernel_size: 5
postnet_n_convolutions: 5
random_z_dim: 512
rgb_feature_dir: data/features/dog/feature_rgb_bninception_dim1024_21.5fps
save_dir: ckpt/dog
seed: 123
test_files: filelists/dog_test.txt
training_files: filelists/dog_train.txt
video_samples: 215
visual_dim: 2048
weight_decay: 1.0e-06
wo_G_GAN: false
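
Several values in this config are tied to each other through the clip length and frame rate. The short Python check below is a sketch that makes the relationship explicit. It assumes `audio_samples: 10` is the clip length in seconds (which matches the `10s` in `mel_dir`), and it reads the frame rate and sample rate from the directory names. The implied hop length is an inference from these numbers, not a value stated in the config.

```python
# Values copied from config/dog_opts.yml and its feature directory names.
audio_seconds = 10      # audio_samples (assumed to be seconds)
video_fps = 21.5        # "..._21.5fps" in rgb_feature_dir / flow_feature_dir
sample_rate = 22050     # "melspec_10s_22050hz" in mel_dir
video_samples = 215
mel_samples = 860

# Number of visual frames expected from clip length and frame rate.
expected_video = round(audio_seconds * video_fps)

# Mel-spectrogram frames per visual frame.
mel_per_video_frame = mel_samples / video_samples

# Audio samples per mel frame implied by the numbers above.
implied_hop = audio_seconds * sample_rate / mel_samples

print(expected_video, mel_per_video_frame, round(implied_hop, 1))
```

Under these assumptions, `video_samples` equals 10 s × 21.5 fps, each visual frame corresponds to four mel frames, and the implied hop is a little over 256 audio samples.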
128 changes: 128 additions & 0 deletions filelists/dog_test.txt
@@ -0,0 +1,128 @@
video_02657
video_02658
video_02659
video_02660
video_02661
video_02662
video_02663
video_02664
video_02665
video_02666
video_02667
video_02668
video_02669
video_02670
video_02671
video_02672
video_02673
video_02674
video_02675
video_02676
video_02677
video_02678
video_02679
video_02680
video_02681
video_02682
video_02683
video_02684
video_02685
video_02686
video_02687
video_02688
video_02689
video_02690
video_02691
video_02692
video_02693
video_02694
video_02695
video_02696
video_02697
video_02698
video_02699
video_02700
video_02701
video_02702
video_02703
video_02704
video_02705
video_02706
video_02707
video_02708
video_02709
video_02710
video_02711
video_02712
video_02713
video_02714
video_02715
video_02716
video_02717
video_02718
video_02719
video_02720
video_02721
video_02722
video_02723
video_02724
video_02725
video_02726
video_02727
video_02728
video_02729
video_02730
video_02731
video_02732
video_02733
video_02734
video_02735
video_02736
video_02737
video_02738
video_02739
video_02740
video_02741
video_02742
video_02743
video_02744
video_02745
video_02746
video_02747
video_02748
video_02749
video_02750
video_02751
video_02752
video_02753
video_02754
video_02755
video_02756
video_02757
video_02758
video_02759
video_02760
video_02761
video_02762
video_02763
video_02764
video_02765
video_02766
video_02767
video_02768
video_02769
video_02770
video_02771
video_02772
video_02773
video_02774
video_02775
video_02776
video_02777
video_02778
video_02779
video_02780
video_02781
video_02782
video_02783
video_02784
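
The test list above is a contiguous run of 128 IDs, from `video_02657` to `video_02784`. A small Python sketch like the following can confirm that a local copy of the file list has no gaps or stray entries; `expected_ids` and `check_filelist` are hypothetical helpers, not part of the repository.

```python
def expected_ids(first: int, last: int) -> list:
    """Build zero-padded IDs such as 'video_02657' for an inclusive range."""
    return [f"video_{i:05d}" for i in range(first, last + 1)]


def check_filelist(lines, first: int, last: int):
    """Return (missing, unexpected) IDs relative to the inclusive range."""
    ids = [line.strip() for line in lines if line.strip()]
    expected = expected_ids(first, last)
    missing = sorted(set(expected) - set(ids))
    unexpected = sorted(set(ids) - set(expected))
    return missing, unexpected


# Example with an in-memory list; for a real check, pass
# open("filelists/dog_test.txt") as the first argument instead.
sample = expected_ids(2657, 2784)
missing, unexpected = check_filelist(sample, 2657, 2784)
```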