Describe the bug
ValueError: Sequence length must be less than max_position_embeddings (got sequence length: 77 and max_position_embeddings: 0

I used four A100 GPUs to run a full fine-tune of the FLUX.1-dev model, following https://github.com/huggingface/diffusers/blob/main/examples/dreambooth/README_flux.md.
I used the toy dog dataset (5 images) for fine-tuning.
I ran into a problem with max_position_embeddings for CLIPTextModel; the full traceback is under Reproduction below.
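For reference, the value stored in the checkpoint config can be checked directly (a minimal sketch, independent of the training script; it assumes the model files are accessible locally or via the Hub):

from transformers import CLIPTextConfig

# Load only the config of the CLIP text encoder shipped with FLUX.1-dev
config = CLIPTextConfig.from_pretrained(
    "black-forest-labs/FLUX.1-dev", subfolder="text_encoder"
)
print(config.max_position_embeddings)  # the checkpoint ships with 77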
Reproduction
[rank1]: Traceback (most recent call last):
[rank1]: File "/data/AIGC/diffusers/examples/dreambooth/train_dreambooth_flux.py", line 1812, in <module>
[rank1]: main(args)
[rank1]: File "/data/AIGC/diffusers/examples/dreambooth/train_dreambooth_flux.py", line 1351, in main
[rank1]: instance_prompt_hidden_states, instance_pooled_prompt_embeds, instance_text_ids = compute_text_embeddings(
[rank1]: File "/data/AIGC/diffusers/examples/dreambooth/train_dreambooth_flux.py", line 1339, in compute_text_embeddings
[rank1]: prompt_embeds, pooled_prompt_embeds, text_ids = encode_prompt(
[rank1]: File "/data/AIGC/diffusers/examples/dreambooth/train_dreambooth_flux.py", line 963, in encode_prompt
[rank1]: pooled_prompt_embeds = _encode_prompt_with_clip(
[rank1]: File "/data/AIGC/diffusers/examples/dreambooth/train_dreambooth_flux.py", line 937, in _encode_prompt_with_clip
[rank1]: prompt_embeds = text_encoder(text_input_ids.to(device), output_hidden_states=False)
[rank1]: File "/root/anaconda3/envs/flux/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
[rank1]: return self._call_impl(*args, **kwargs)
[rank1]: File "/root/anaconda3/envs/flux/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
[rank1]: return forward_call(*args, **kwargs)
[rank1]: File "/root/anaconda3/envs/flux/lib/python3.10/site-packages/transformers/models/clip/modeling_clip.py", line 1056, in forward
[rank1]: return self.text_model(
[rank1]: File "/root/anaconda3/envs/flux/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
[rank1]: return self._call_impl(*args, **kwargs)
[rank1]: File "/root/anaconda3/envs/flux/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
[rank1]: return forward_call(*args, **kwargs)
[rank1]: File "/root/anaconda3/envs/flux/lib/python3.10/site-packages/transformers/models/clip/modeling_clip.py", line 947, in forward
[rank1]: hidden_states = self.embeddings(input_ids=input_ids, position_ids=position_ids)
[rank1]: File "/root/anaconda3/envs/flux/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
[rank1]: return self._call_impl(*args, **kwargs)
[rank1]: File "/root/anaconda3/envs/flux/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
[rank1]: return forward_call(*args, **kwargs)
[rank1]: File "/root/anaconda3/envs/flux/lib/python3.10/site-packages/transformers/models/clip/modeling_clip.py", line 283, in forward
[rank1]: raise ValueError(
[rank1]: ValueError: Sequence length must be less than max_position_embeddings (got sequence length: 77 and max_position_embeddings: 0

I changed max_position_embeddings in CLIPTextModel but it doesn't work:
text_encoder_one = class_one.from_pretrained(
    args.pretrained_model_name_or_path,
    subfolder="text_encoder",
    revision=args.revision,
    variant=args.variant,
    max_position_embeddings=77,
    ignore_mismatched_sizes=True,
)
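One variant worth noting (a sketch only, assuming the standard transformers API; I have not verified that it avoids the error) is to load and patch the config explicitly instead of passing the value as a keyword argument:

# Sketch: override the config before loading the weights.
# Assumes class_one is CLIPTextModel, as in the script; untested here.
config = class_one.config_class.from_pretrained(
    args.pretrained_model_name_or_path, subfolder="text_encoder", revision=args.revision
)
config.max_position_embeddings = 77
text_encoder_one = class_one.from_pretrained(
    args.pretrained_model_name_or_path,
    subfolder="text_encoder",
    revision=args.revision,
    variant=args.variant,
    config=config,
    ignore_mismatched_sizes=True,
)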
My training script is as follows:
export MODEL_NAME="black-forest-labs/FLUX.1-dev"
export INSTANCE_DIR="dog"
export OUTPUT_DIR="trained-flux"
accelerate launch train_dreambooth_flux.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --instance_data_dir=$INSTANCE_DIR \
  --output_dir=$OUTPUT_DIR \
  --mixed_precision="bf16" \
  --instance_prompt="a photo of sks dog" \
  --resolution=1024 \
  --train_batch_size=1 \
  --guidance_scale=1 \
  --gradient_accumulation_steps=4 \
  --optimizer="prodigy" \
  --learning_rate=1. \
  --report_to="wandb" \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --max_train_steps=500 \
  --validation_prompt="A photo of sks dog in a bucket" \
  --validation_epochs=25 \
  --seed="0" \
  --push_to_hub
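To separate the failure from the multi-GPU launch, the same CLIP encoding step can be reproduced in a single process (a minimal sketch; the model id, prompt, and 77-token padding mirror the script, and it assumes access to the gated FLUX.1-dev repo):

import torch
from transformers import CLIPTextModel, CLIPTokenizer

model_id = "black-forest-labs/FLUX.1-dev"
tokenizer = CLIPTokenizer.from_pretrained(model_id, subfolder="tokenizer")
text_encoder = CLIPTextModel.from_pretrained(model_id, subfolder="text_encoder")

# Pad to CLIP's 77-token limit, as the training script does
input_ids = tokenizer(
    "a photo of sks dog",
    padding="max_length",
    max_length=77,
    truncation=True,
    return_tensors="pt",
).input_ids

with torch.no_grad():
    out = text_encoder(input_ids, output_hidden_states=False)
print(out.pooler_output.shape)  # expected on a healthy setup: torch.Size([1, 768])

If this runs cleanly, it would suggest the problem is tied to how the encoder is loaded across ranks rather than to the checkpoint itself.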
Logs
System Info
NVIDIA A100-SXM4-40GB, 40960 MiB
NVIDIA A100-SXM4-40GB, 40960 MiB
NVIDIA A100-SXM4-40GB, 40960 MiB
Who can help?
No response