
Diffusers example train_text_to_image_lora.py broken gradients? #6277

Closed
jaymefosa opened this issue Dec 21, 2023 · 3 comments
Labels
bug Something isn't working

Comments

@jaymefosa

Describe the bug

Gradients won't backpropagate when running the example LoRA training script.

Reproduction

Running the command as specified:

export MODEL_NAME="CompVis/stable-diffusion-v1-4"
export DATASET_NAME="lambdalabs/pokemon-blip-captions"

accelerate launch --mixed_precision="fp16" train_text_to_image_lora.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --dataset_name=$DATASET_NAME --caption_column="text" \
  --resolution=512 --random_flip \
  --train_batch_size=1 \
  --num_train_epochs=100 --checkpointing_steps=5000 \
  --learning_rate=1e-04 --lr_scheduler="constant" --lr_warmup_steps=0 \
  --seed=42 \
  --output_dir="sd-pokemon-model-lora" \
  --validation_prompt="cute dragon creature"

Logs

File "train_text_to_image_lora.py", line 950, in <module>
    main()
  File "train_text_to_image_lora.py", line 801, in main
    accelerator.backward(loss)
  File "/home/fsa/anaconda3/envs/manimate/lib/python3.8/site-packages/accelerate/accelerator.py", line 1903, in backward
    self.scaler.scale(loss).backward(**kwargs)
  File "/home/fsa/anaconda3/envs/manimate/lib/python3.8/site-packages/torch/_tensor.py", line 492, in backward
    torch.autograd.backward(
  File "/home/fsa/anaconda3/envs/manimate/lib/python3.8/site-packages/torch/autograd/__init__.py", line 251, in backward
    Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn
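
This error means the loss is not connected to any parameter with requires_grad=True, so autograd has nothing to differentiate. A quick diagnostic (a sketch, reusing the script's own unet and loss variables) is to print the trainable state just before accelerator.backward(loss):

# Hypothetical check, placed immediately before accelerator.backward(loss);
# `unet` and `loss` are the training script's existing variables.
trainable = [name for name, p in unet.named_parameters() if p.requires_grad]
print(f"trainable params: {len(trainable)}")
print(f"loss.requires_grad={loss.requires_grad}, loss.grad_fn={loss.grad_fn}")

If the list comes back empty, no trainable LoRA parameters were ever added to the UNet, which produces exactly this RuntimeError.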

System Info

torch 2.1.2
transformers 4.36.2
peft 0.7.1
diffusers 0.24.0
accelerate 0.25.0

Who can help?

@sayakpaul

@jaymefosa added the bug label on Dec 21, 2023
@jaymefosa (Author)

Just to add a detail: the non-LoRA version of the training (from the same examples.md page) trains fine.

@sayakpaul (Member)

We recently switched to peft for training LoRAs. Could you please refer to the latest version of this script from the main branch? Also bear #6225 in mind.
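
For reference, a minimal sketch of how the peft-based script on main wires up LoRA (illustrative values; the exact rank, alpha, and target modules here are assumptions, not necessarily the upstream defaults):

import torch
from diffusers import UNet2DConditionModel
from peft import LoraConfig

# Load the base UNet and freeze it; only the injected LoRA weights should train.
unet = UNet2DConditionModel.from_pretrained(
    "CompVis/stable-diffusion-v1-4", subfolder="unet"
)
unet.requires_grad_(False)

# Inject LoRA layers into the attention projections via peft.
lora_config = LoraConfig(
    r=4,
    lora_alpha=4,
    init_lora_weights="gaussian",
    target_modules=["to_k", "to_q", "to_v", "to_out.0"],
)
unet.add_adapter(lora_config)

# Only the injected LoRA parameters require grad; these are what go into the optimizer.
lora_params = [p for p in unet.parameters() if p.requires_grad]
optimizer = torch.optim.AdamW(lora_params, lr=1e-4)

An older checkout of the example (which predates the peft switch) will not set parameters up this way, which is why the script version needs to match the installed library.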

@jaymefosa (Author)

Oh, my sincere apologies! I'm checked out to a detached HEAD at v0.21.4.
