[LoRA] Quanto Flux LoRA can't load #10512
Can you try with

```bash
pip uninstall diffusers -y
pip install git+https://github.com/huggingface/diffusers
```
The problem still exists after this operation.
Do you have a minimal reproducible snippet? The provided one isn't minimal and self-contained. I keep asking for that because we have an integration test for Kohya LoRAs here: `diffusers/tests/lora/test_lora_layers_flux.py` (line 847 at commit 83ba01a).
It was run yesterday, too, and it worked fine.
This issue only occurs when loading a LoRA after quantizing the FLUX transformer with optimum.quanto. If the model is not quantized, the LoRA loads normally. In diffusers 0.31, the LoRA could be loaded successfully even after quantization.
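For what it's worth, a minimal sketch of the failing path as described above might look like this (the LoRA repo is a placeholder; any Flux LoRA should reproduce it, per the report):

```python
import torch
from diffusers import FluxPipeline
from optimum.quanto import freeze, qfloat8, quantize

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
)

# Quantize the transformer weights in place with optimum.quanto, then freeze.
quantize(pipe.transformer, weights=qfloat8)
freeze(pipe.transformer)

# Reportedly worked on diffusers 0.31 but fails on 0.32.x.
pipe.load_lora_weights("black-forest-labs/FLUX.1-Depth-dev-lora")
```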
@tyyff if you could help me with a minimally reproducible snippet, that would be great, ideally with a supported quantization backend.
I used the script and quantization method here:
Can you solve the problem with the flux-fp8 version? Thanks!
Or can diffusers under 0.32.0 support Flux Redux?
Just a combination of two examples from the article on using Flux:

```python
import torch
from diffusers import (
    BitsAndBytesConfig as DiffusersBitsAndBytesConfig,
    FluxControlPipeline,
    FluxTransformer2DModel,
)
from diffusers.utils import load_image
from huggingface_hub import hf_hub_download
from image_gen_aux import DepthPreprocessor
from transformers import BitsAndBytesConfig, T5EncoderModel

# Load the T5 text encoder in 8-bit (transformers' BitsAndBytesConfig).
text_encoder_8bit = T5EncoderModel.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    subfolder="text_encoder_2",
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    torch_dtype=torch.float16,
)

# Load the Flux transformer in 8-bit (diffusers' BitsAndBytesConfig).
transformer_8bit = FluxTransformer2DModel.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    subfolder="transformer",
    quantization_config=DiffusersBitsAndBytesConfig(load_in_8bit=True),
    torch_dtype=torch.float16,
)

control_pipe = FluxControlPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    text_encoder=text_encoder_8bit,
    transformer=transformer_8bit,
    torch_dtype=torch.float16,
    device_map="balanced",
)

control_pipe.load_lora_weights("black-forest-labs/FLUX.1-Depth-dev-lora", adapter_name="depth")
control_pipe.load_lora_weights(
    hf_hub_download("ByteDance/Hyper-SD", "Hyper-FLUX.1-dev-8steps-lora.safetensors"),
    adapter_name="hyper-sd",
)
control_pipe.set_adapters(["depth", "hyper-sd"], adapter_weights=[0.85, 0.125])
control_pipe.enable_model_cpu_offload()

prompt = "A robot made of exotic candies and chocolates of different kinds. The background is filled with confetti and celebratory gifts."
control_image = load_image("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/robot.png")
processor = DepthPreprocessor.from_pretrained("LiheYoung/depth-anything-large-hf")
control_image = processor(control_image)[0].convert("RGB")

image = control_pipe(
    prompt=prompt,
    control_image=control_image,
    height=1024,
    width=1024,
    num_inference_steps=8,
    guidance_scale=10.0,
    generator=torch.Generator().manual_seed(42),
).images[0]
image.save("out.jpg")  # was images[0].save(...), but `image` already holds .images[0]
```

It fails at the first `load_lora_weights` call:

```text
    control_pipe.load_lora_weights("black-forest-labs/FLUX.1-Depth-dev-lora", adapter_name="depth")
  File "/usr/local/lib/python3.10/dist-packages/diffusers/loaders/lora_pipeline.py", line 1856, in load_lora_weights
    has_param_with_expanded_shape = self._maybe_expand_transformer_param_shape_or_error_(
  File "/usr/local/lib/python3.10/dist-packages/diffusers/loaders/lora_pipeline.py", line 2359, in _maybe_expand_transformer_param_shape_or_error_
    expanded_module = torch.nn.Linear(
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/linear.py", line 99, in __init__
    self.weight = Parameter(torch.empty((out_features, in_features), **factory_kwargs))
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/parameter.py", line 40, in __new__
    return torch.Tensor._make_subclass(cls, data, requires_grad)
RuntimeError: Only Tensors of floating point and complex dtype can require gradients
```
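For context, that RuntimeError is what PyTorch raises whenever a `Parameter` is created from a non-floating-point tensor, which is presumably what happens when the expanded `nn.Linear` inherits the quantized weight's dtype. A standalone sketch of the same failure:

```python
import torch

# nn.Linear allocates its weight as a Parameter with requires_grad=True;
# integer tensors cannot require gradients, so this raises the same error.
torch.nn.Linear(8, 8, dtype=torch.int8)
# RuntimeError: Only Tensors of floating point and complex dtype can require gradients
```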
This is the pip `requirements.txt`:
Tracking here: #10550.
I tested it, too. Error:

```bash
Traceback (most recent call last):
  File "/home/sayak/diffusers/check_fp8.py", line 22, in <module>
    pipe.load_lora_weights(
  File "/home/sayak/diffusers/src/diffusers/loaders/lora_pipeline.py", line 1846, in load_lora_weights
    self.load_lora_into_transformer(
  File "/home/sayak/diffusers/src/diffusers/loaders/lora_pipeline.py", line 1949, in load_lora_into_transformer
    incompatible_keys = set_peft_model_state_dict(transformer, state_dict, adapter_name, **peft_kwargs)
  File "/home/sayak/peft/src/peft/utils/save_and_load.py", line 445, in set_peft_model_state_dict
    load_result = model.load_state_dict(peft_model_state_dict, strict=False, assign=True)
  File "/home/sayak/.pyenv/versions/3.10.12/envs/diffusers/lib/python3.10/site-packages/torch/nn/modules/module.py", line 2564, in load_state_dict
    load(self, state_dict)
  File "/home/sayak/.pyenv/versions/3.10.12/envs/diffusers/lib/python3.10/site-packages/torch/nn/modules/module.py", line 2552, in load
    load(child, child_state_dict, child_prefix)  # noqa: F821
  File "/home/sayak/.pyenv/versions/3.10.12/envs/diffusers/lib/python3.10/site-packages/torch/nn/modules/module.py", line 2552, in load
    load(child, child_state_dict, child_prefix)  # noqa: F821
  File "/home/sayak/.pyenv/versions/3.10.12/envs/diffusers/lib/python3.10/site-packages/torch/nn/modules/module.py", line 2552, in load
    load(child, child_state_dict, child_prefix)  # noqa: F821
  [Previous line repeated 1 more time]
  File "/home/sayak/.pyenv/versions/3.10.12/envs/diffusers/lib/python3.10/site-packages/torch/nn/modules/module.py", line 2535, in load
    module._load_from_state_dict(
  File "/home/sayak/.pyenv/versions/3.10.12/envs/diffusers/lib/python3.10/site-packages/optimum/quanto/nn/qmodule.py", line 160, in _load_from_state_dict
    deserialized_weight = WeightQBytesTensor.load_from_state_dict(
  File "/home/sayak/.pyenv/versions/3.10.12/envs/diffusers/lib/python3.10/site-packages/optimum/quanto/tensor/weights/qbytes.py", line 77, in load_from_state_dict
    inner_tensors_dict[name] = state_dict.pop(prefix + name)
KeyError: 'time_text_embed.timestep_embedder.linear_1.base_layer.weight._data'
```

Tracking it here:
As I read the issue and PR you linked, the issue I'm facing is most likely due to Quanto not being supported with PEFT.
Yes, you're right. 4-bit support is being added in #10578. However, I just edited your issue title a bit to reflect that Quanto support needs to be added. Hope that's okay with you.
And in 8-bit? My issue is about the 8-bit qfloat8. The title change is fine with me.
Both 4-bit and 8-bit. For 8-bit, make sure you install `peft` from source.
And diffusers 0.32.1 or from source?
Source.
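A sketch of that setup, assuming "source" means the current main branches of both libraries:

```bash
pip install git+https://github.com/huggingface/peft
pip install git+https://github.com/huggingface/diffusers
```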
Hi, we are getting this error only with diffusers > 0.31: the same setup works on 0.31 but fails on later versions with `KeyError: 'single_transformer_blocks.0.attn.to_k.weight'`.
Can you provide a reproducible snippet?
I think he posted it here.

Not the same person :)
We use the following to load the quantized encoder:

The repo we are using is Disty0/FLUX.1-dev-qint8, on diffusers 0.31.0.
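In case it helps triage, a minimal sketch of how a prequantized quanto repo like Disty0/FLUX.1-dev-qint8 is typically loaded (assuming it follows optimum-quanto's serialization layout; the wrapper class name is illustrative):

```python
from diffusers import FluxTransformer2DModel
from optimum.quanto import QuantizedDiffusersModel


class QuantizedFluxTransformer2DModel(QuantizedDiffusersModel):
    # optimum-quanto needs to know which diffusers class to deserialize into.
    base_class = FluxTransformer2DModel


# Loads the prequantized qint8 transformer; the returned object wraps a
# regular FluxTransformer2DModel whose Linear layers are quanto QLinear.
transformer = QuantizedFluxTransformer2DModel.from_pretrained("Disty0/FLUX.1-dev-qint8")
```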
Describe the bug
Cannot load LoRAs into quanto-quantized Flux.
Logs
System Info
Python 3.12
diffusers 0.32.0 (I also tested 0.32.1 and installing from git)
Who can help?
@sayakpaul