Error for training landscape dataset #6
Hi, our pre-trained models were not trained with LoRA, so I have not encountered this error. Try using the config file to disable LoRA during training (train only the adapter).
Thank you! I still don't quite understand the adapter you mentioned. Regarding the Landscape and AudioSet-Drum datasets, could you tell me which training modules should be set to True in the config file?
Be sure to set `use_unet_lora` to `False` in the config file to disable LoRA training:
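A sketch of what the relevant config excerpt might look like; only `use_unet_lora` is confirmed by this thread, and the surrounding structure is illustrative, not the repo's actual config:

```yaml
# Hypothetical excerpt of the training config file.
# use_unet_lora is the key named in this thread; setting it to False
# disables LoRA injection so only the adapter is trained.
use_unet_lora: False
```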
Thank you for your patient reply and excellent work! I encountered several errors while running your training code, including issues with default parameter settings and dataset input. I'm not sure whether this is because I don't fully understand the code framework or because there are some issues in the code. For example, this error: File "train.py", line 1056, in main. Could you possibly provide the final version of the code you used for training on the AudioSet-Drum or Landscape datasets (including the config file)? I would be very grateful!
It looks like you didn't load the datasets as required: they should be split into an audio folder and a video folder, but it appears you tried to load them from an empty sequence. The current code should run without any issues. Feel free to reach out to me by email at guyyariv.mail at gmail dot com.
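A minimal sketch of how one might split a mixed dataset directory into the separate `video/` and `audio/` folders described above. The folder names, file extensions, and function name here are assumptions for illustration, not taken from the repo:

```python
import shutil
from pathlib import Path

# Assumed extensions; adjust to match your dataset's actual file types.
VIDEO_EXTS = {".mp4", ".avi", ".mov"}
AUDIO_EXTS = {".wav", ".mp3", ".flac"}

def split_dataset(src: str, dst: str) -> None:
    """Copy files from src into dst/video and dst/audio based on extension."""
    dst_path = Path(dst)
    (dst_path / "video").mkdir(parents=True, exist_ok=True)
    (dst_path / "audio").mkdir(parents=True, exist_ok=True)
    for f in Path(src).iterdir():
        suffix = f.suffix.lower()
        if suffix in VIDEO_EXTS:
            shutil.copy2(f, dst_path / "video" / f.name)
        elif suffix in AUDIO_EXTS:
            shutil.copy2(f, dst_path / "audio" / f.name)
```

Usage: `split_dataset("landscape_raw", "landscape")` would leave the videos under `landscape/video/` and the audio files under `landscape/audio/`.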
Following your suggestions, I attempted to replicate the process and ran experiments on the Landscape and AudioSet-Drum datasets (without changing the provided config files). However, my results have been less than satisfactory. Here are the changes I made to the code: the audio data is stereo, so the input dimension is [2, 16000], while your dataset code sets the audio input dimension as [1, 16000], so I performed a simple average over the first dimension.
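The stereo-to-mono averaging described above can be sketched with NumPy; the [2, 16000] and [1, 16000] shapes follow the comment, but the actual tensor types and dataset code in the repo may differ:

```python
import numpy as np

def stereo_to_mono(audio: np.ndarray) -> np.ndarray:
    """Average a [2, T] stereo waveform into a [1, T] mono waveform."""
    assert audio.ndim == 2 and audio.shape[0] == 2, "expected [2, T] input"
    # Mean over the channel axis; keepdims preserves the leading dimension.
    return audio.mean(axis=0, keepdims=True)
```

For example, `stereo_to_mono(np.zeros((2, 16000)))` yields an array of shape `(1, 16000)`.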
Hi, I'm not sure why you cannot reproduce the Landscape and AudioSet-Drum results. These are both easy datasets (less challenging than VGGSound, for example), and the model should converge quickly and with high quality on them. I used the provided versions of those datasets (as mentioned in the README; for example, https://drive.google.com/drive/folders/14A1zaQI5EfShlv3QirgCGeNFzZBzQ3lq is Landscape) and split them into video-only and audio-only (mono) folders, then used the provided config file. Please ask ChatGPT to help you split them into two different folders, then try training again. Let me know if it is improved now.
Hello, thank you for your patient reply! I still have a few questions regarding the code implementation that I would like to confirm with you: 1. The original video sizes of the three datasets differ (Landscape is 288x512, AudioSet-Drum is 96x96, VGGSound is 360x212), yet the config files set the video size for training and inference to 384x384 and use a bucketing strategy. Should I change this parameter, or follow your setting and standardize everything to 384x384? I look forward to your reply!
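For concreteness, standardizing frames of different native resolutions to the 384x384 training size could look like the nearest-neighbour resize below. This is only an illustration of the idea; the repo's actual preprocessing and bucketing code may resize differently (e.g. with interpolation or aspect-preserving crops):

```python
import numpy as np

def resize_frame(frame: np.ndarray, size: int = 384) -> np.ndarray:
    """Nearest-neighbour resize of an HxWxC frame to size x size."""
    h, w = frame.shape[:2]
    # Map each output row/column back to a source row/column index.
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return frame[rows][:, cols]
```

For example, a 288x512 Landscape frame passed through `resize_frame` comes out with shape `(384, 384, 3)`.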
When I run the training code on the Landscape dataset, I encounter the following error. How should I solve it?
```
LoRA rank 16 is too large. setting to: 4
Traceback (most recent call last):
  File "train.py", line 1221, in <module>
    main(**config)
  File "train.py", line 770, in main
    unet_lora_params, unet_negation = inject_lora(
  File "train.py", line 293, in inject_lora
    params, negation = injector(**injector_args)
  File "/home/TempoTokens/utils/lora.py", line 461, in inject_trainable_lora_extended
    _tmp.to(_child_module.bias.device).to(_child_module.bias.dtype)
AttributeError: 'NoneType' object has no attribute 'device'
```
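The traceback suggests `inject_trainable_lora_extended` assumes every replaced layer has a bias, which fails for layers built without one (e.g. `Conv2d(..., bias=False)`). A hedged sketch of the kind of `None` guard that avoids the crash, using toy stand-in classes rather than the repo's actual torch code:

```python
class Tensor:
    """Toy tensor exposing .device/.dtype like torch, for illustration only."""
    def __init__(self, device="cpu", dtype="float32"):
        self.device, self.dtype = device, dtype

class Layer:
    """Toy layer whose bias may be None, as for a bias-free Conv/Linear."""
    def __init__(self, bias=None):
        self.weight = Tensor()
        self.bias = bias

def pick_reference_tensor(child):
    """Return the tensor whose device/dtype the injected LoRA module copies.

    Guarding on `bias is None` avoids the AttributeError in the traceback:
    fall back to the weight when the layer has no bias.
    """
    return child.bias if child.bias is not None else child.weight
```

The same guard, applied around `utils/lora.py` line 461, would dereference `.device`/`.dtype` on the weight instead of a missing bias.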
Thank you for your answer!