
add xpu check in get_quantized_model_device_map #3397

Open
wants to merge 1 commit into main

Conversation

@faaany (Contributor) commented Feb 14, 2025

What does this PR do?

When running the code example below from the Model quantization doc, it raises an error saying that no GPU is found. This PR fixes that error.

from accelerate import init_empty_weights
from mingpt.model import GPT
import torch

model_config = GPT.get_default_config()
model_config.model_type = 'gpt2-xl'
model_config.vocab_size = 50257
model_config.block_size = 1024

with init_empty_weights():
    empty_model = GPT(model_config)
    
from huggingface_hub import snapshot_download
weights_location = snapshot_download(repo_id="marcsun13/gpt2-xl-linear-sharded")

from accelerate.utils import BnbQuantizationConfig
bnb_quantization_config = BnbQuantizationConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.bfloat16, bnb_4bit_use_double_quant=True, bnb_4bit_quant_type="nf4")

# Load and quantize the model; the device map is generated internally
from accelerate.utils import load_and_quantize_model
quantized_model = load_and_quantize_model(empty_model, weights_location=weights_location, bnb_quantization_config=bnb_quantization_config)
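
For context, the default device-map logic in get_quantized_model_device_map only probes for CUDA devices, which is why the example above fails on an XPU-only machine. Below is a minimal sketch of the kind of XPU-aware fallback this PR title describes; it is illustrative only, not the exact diff, and the helper name is hypothetical (is_xpu_available is accelerate's utility):

import torch
from accelerate.utils import is_xpu_available

def _default_quantization_device_map():
    # Sketch: prefer CUDA, fall back to XPU, otherwise keep the original error.
    if torch.cuda.is_available():
        return {"": torch.cuda.current_device()}
    if is_xpu_available():
        # torch.xpu is only queried when an XPU backend is actually available.
        return {"": torch.xpu.current_device()}
    raise RuntimeError("No GPU found. A GPU is needed for quantization.")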

Another issue I found is that with device_map="auto" the following error is raised:

Traceback (most recent call last):
  File "/home/sdp/fanli/doc_to_fix.py", line 21, in <module>
    quantized_model = load_and_quantize_model(empty_model, no_split_module_classes=["GPT"], weights_location=weights_location, bnb_quantization_config=bnb_quantization_config, device_map="auto")
  File "/home/sdp/fanli/accelerate/src/accelerate/utils/bnb.py", line 184, in load_and_quantize_model
    load_checkpoint_in_model(
  File "/home/sdp/fanli/accelerate/src/accelerate/utils/modeling.py", line 1903, in load_checkpoint_in_model
    raise ValueError(f"{param_name} doesn't have any device set.")
ValueError: transformer.h.5.attn.bias doesn't have any device set.

It seems that GPT should be a no_split_module (see huggingface/transformers#23294). So I removed device_map="auto", but I am not 100% sure this is the right fix. If you have any ideas or suggestions, please let me know and I can investigate further.
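
If it helps with the investigation, one way to check whether treating GPT as a no-split module gives every parameter and buffer a device is to inspect the inferred device map directly. A quick sketch using accelerate's infer_auto_device_map on the meta-initialized model from the snippet above:

from accelerate import infer_auto_device_map

# With the whole GPT block kept together, buffers such as
# transformer.h.5.attn.bias should fall under a single device entry.
device_map = infer_auto_device_map(empty_model, no_split_module_classes=["GPT"])
print(device_map)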

cc @SunMarc and @muellerzr

@SunMarc (Member) commented Feb 14, 2025

Thanks!

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
