
Unable to train model on SPEECHCOMMANDS dataset #1

Open
jayathungek opened this issue Apr 22, 2022 · 0 comments

I've downloaded the speech_commands_v0.02 tar file and extracted it into the following directory structure:

data/SCNUMBERS1024
└── SpeechCommands
    └── speech_commands_v0.02
        ├── _background_noise_
        ├── backward
        ├── bed
        ├── bird
        ├── cat
        ├── dog
        ├── down
        ├── eight
        ├── five
        ├── follow
        ├── forward
        ├── four
        ├── go
        ├── happy
        ├── house
        ├── learn
        ├── left
        ├── marvin
        ├── nine
        ├── no
        ├── off
        ├── on
        ├── one
        ├── right
        ├── seven
        ├── sheila
        ├── six
        ├── stop
        ├── three
        ├── tree
        ├── two
        ├── up
        ├── visual
        ├── wow
        ├── yes
        └── zero

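For what it's worth, here is the sanity check I would run to confirm that torchaudio can see the extracted files. This is just my own snippet, not the project's loader, and it assumes the repo ultimately points torchaudio.datasets.SPEECHCOMMANDS at data/SCNUMBERS1024 as the root (I'm not certain that matches what train.py does internally):

import torchaudio

# Sanity check only: root is the directory that contains the
# SpeechCommands/ folder, i.e. data/SCNUMBERS1024 in my layout.
dataset = torchaudio.datasets.SPEECHCOMMANDS(
    root="data/SCNUMBERS1024",
    url="speech_commands_v0.02",
    folder_in_archive="SpeechCommands",
    download=False,
)
waveform, sample_rate, label, speaker_id, utterance_number = dataset[0]
print(len(dataset), waveform.shape, sample_rate, label)
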
I then try to train the model on this dataset via:

$ python train.py --wandb 0 --architecture pi-gan_wide --dataset_name SPEECHCOMMANDS --dataset_size 128

but I run into a NoneType error, which leads me to believe that the dataset is not being initialised properly somehow. The full output of running the above command is below:

~/.virtualenvs/pcinr/lib/python3.8/site-packages/torchaudio/backend/utils.py:53: UserWarning: "sox" backend is being deprecated. The default backend will be changed to "sox_io" backend in 0.8.0 and "sox" backend will be removed in 0.9.0. Please migrate to "sox_io" backend. Please refer to https://github.com/pytorch/audio/issues/903 for the detail.
  warnings.warn(
1
{   'architecture': 'pi-gan_wide',
    'audio_length': 16000,
    'autoconfig': 0,
    'batch_size': 128,
    'cdpam': 0,
    'coord_multi': 1,
    'dataset_name': 'SPEECHCOMMANDS',
    'dataset_size': 128,
    'deriv_per_sample': 1,
    'double': 0,
    'eval_every': 5000,
    'eval_samples': 1,
    'eval_upscale_ratio': 1,
    'first_omega_0': 3000,
    'hidden_omega_0': 30,
    'input_dim': 1,
    'latent_descent_steps': 1,
    'latent_init_std': 0.001,
    'latent_lr': 0.3,
    'lr': 1e-05,
    'max_high_res_batch_size': 16,
    'meta_architecture': 'autodecoder',
    'multiscale_STFT': 0,
    'note': 'default',
    'note_general': 'default',
    'num_epochs': 10001,
    'num_groups': 0,
    'num_latent': 256,
    'output_dim': 1,
    'per_sample': 1,
    'prog_weight_decay_every': 0,
    'prog_weight_decay_factor': 0,
    'sample_even': 1,
    'samples_per_datapoint': 2000,
    'save_audio': 1,
    'save_audio_plots': 0,
    'save_latents': 1,
    'save_model': 1,
    'save_path': 'results/default/SPEECHCOMMANDS/pi-gan_wide/autodecoder',
    'use_gpu': 1,
    'use_multi_gpu': 0,
    'wandb': 0,
    'wandb_project_name': 'neurips',
    'weight_decay': 0,
    'weight_norm': 0}
activations: ['sine', 'sine', 'none']
init_methods: [{'weights': 'siren_first', 'bias': 'polar'}, {'weights': 'siren', 'bias': 'polar'}, {'weights': 'siren_omega', 'omega': 30, 'bias': 'none'}]
layer 0: Film conditioned
layer 1: Film conditioned
layer 2: Film conditioned
layer 3: Film conditioned
piGAN_custom(
  (film_mapping_net): PiGANMappingNetwork(
    (net): Sequential(
      (0): Linear(in_features=256, out_features=256, bias=True)
      (1): LeakyReLU(negative_slope=0.2, inplace=True)
      (2): Linear(in_features=256, out_features=256, bias=True)
      (3): LeakyReLU(negative_slope=0.2, inplace=True)
      (4): Linear(in_features=256, out_features=256, bias=True)
      (5): LeakyReLU(negative_slope=0.2, inplace=True)
      (6): Linear(in_features=256, out_features=730, bias=True)
    )
  )
  (net): Sequential(
    (0): ImplicitMLPLayer(
      (linear): Linear(in_features=1, out_features=365, bias=True)
    )
    (1): ImplicitMLPLayer(
      (linear): Linear(in_features=365, out_features=365, bias=True)
    )
    (2): ImplicitMLPLayer(
      (linear): Linear(in_features=365, out_features=365, bias=True)
    )
    (3): ImplicitMLPLayer(
      (linear): Linear(in_features=365, out_features=365, bias=True)
    )
    (4): ImplicitMLPLayer(
      (linear): Linear(in_features=365, out_features=1, bias=True)
    )
  )
)
Number of parameters: 786852
Random Seed:  0
~/Desktop/phd/continuous-audio-representations/objective.py:11: UserWarning: torch.range is deprecated and will be removed in a future release because its behavior is inconsistent with Python's range builtin. Instead, use torch.arange, which produces values in [start, end).
  self.finite_diff_derivative = torch.range(-1,1,2).unsqueeze(0).unsqueeze(0).to(device)
Seeing  1 GPUs
Starting run for 10001 epochs..
Traceback (most recent call last):
  File "train.py", line 358, in <module>
    train(model, optim_INR, optim_mapping, scheduler, train_loader, config)
  File "train.py", line 80, in train
    g = model(sampled_coords, z=z)
  File "/home/kavi/.virtualenvs/pcinr/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/kavi/.virtualenvs/pcinr/lib/python3.8/site-packages/INR_collection/modules.py", line 487, in forward
    concat = concat.repeat(1, coordinates.shape[1], 1)
AttributeError: 'NoneType' object has no attribute 'repeat'

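In case it helps, here is a minimal sketch of the failure mode I suspect. This is only my guess at the shape of the code in INR_collection/modules.py, not the actual implementation: if the FiLM conditioning tensor is only built when a latent z is provided, then a None z leaves it as None and crashes on .repeat exactly as above.

import torch

# Hypothetical reduction of the failure mode, not the library's real code.
class ToyFilmINR(torch.nn.Module):
    def __init__(self):
        super().__init__()
        # mirrors the printed mapping-net output size (256 -> 730)
        self.mapping = torch.nn.Linear(256, 730)

    def forward(self, coordinates, z=None):
        concat = None
        if z is not None:
            # FiLM parameters, one set per batch element
            concat = self.mapping(z).unsqueeze(1)
        # If z was never passed (e.g. no latents were set up for the
        # dataset), concat is still None here and this line raises
        # AttributeError: 'NoneType' object has no attribute 'repeat'
        concat = concat.repeat(1, coordinates.shape[1], 1)
        return concat

coords = torch.rand(4, 2000, 1)
ToyFilmINR()(coords, z=None)  # reproduces the same AttributeError
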
Any idea why this might be? Thank you.
