
add xpu check in get_quantized_model_device_map #3397

Open
wants to merge 1 commit into main

Conversation

@faaany (Contributor) commented Feb 14, 2025

What does this PR do?

When running the code example below from the Model quantization doc, it raises an error saying that no GPU is found. This PR fixes that error.

from accelerate import init_empty_weights
from mingpt.model import GPT
import torch

model_config = GPT.get_default_config()
model_config.model_type = 'gpt2-xl'
model_config.vocab_size = 50257
model_config.block_size = 1024

with init_empty_weights():
    empty_model = GPT(model_config)
    
from huggingface_hub import snapshot_download
weights_location = snapshot_download(repo_id="marcsun13/gpt2-xl-linear-sharded")

from accelerate.utils import BnbQuantizationConfig
bnb_quantization_config = BnbQuantizationConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.bfloat16, bnb_4bit_use_double_quant=True, bnb_4bit_quant_type="nf4")

# Load and quantize the model; the device map is generated internally
from accelerate.utils import load_and_quantize_model
quantized_model = load_and_quantize_model(empty_model, weights_location=weights_location, bnb_quantization_config=bnb_quantization_config)
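
For context, the default device-map logic in get_quantized_model_device_map only probes for CUDA devices, which is why the example above fails on an XPU-only machine. Below is a minimal sketch of the kind of XPU-aware fallback this PR title describes; it is illustrative only, not the exact diff, and the helper name is hypothetical (is_xpu_available is accelerate's utility):

import torch
from accelerate.utils import is_xpu_available

def _default_quantization_device_map():
    # Sketch: prefer CUDA, fall back to XPU, otherwise keep the original error.
    if torch.cuda.is_available():
        return {"": torch.cuda.current_device()}
    if is_xpu_available():
        # torch.xpu is only queried when an XPU backend is actually available.
        return {"": torch.xpu.current_device()}
    raise RuntimeError("No GPU found. A GPU is needed for quantization.")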

Another issue I found is that with device_map="auto" the following error is raised:

Traceback (most recent call last):
  File "/home/sdp/fanli/doc_to_fix.py", line 21, in <module>
    quantized_model = load_and_quantize_model(empty_model, no_split_module_classes=["GPT"], weights_location=weights_location, bnb_quantization_config=bnb_quantization_config, device_map="auto")
  File "/home/sdp/fanli/accelerate/src/accelerate/utils/bnb.py", line 184, in load_and_quantize_model
    load_checkpoint_in_model(
  File "/home/sdp/fanli/accelerate/src/accelerate/utils/modeling.py", line 1903, in load_checkpoint_in_model
    raise ValueError(f"{param_name} doesn't have any device set.")
ValueError: transformer.h.5.attn.bias doesn't have any device set.

It seems that GPT should be a no_split_module (see huggingface/transformers#23294). So I removed device_map="auto", but I am not 100% sure this is the right fix. If you have any ideas or suggestions, please let me know and I can investigate further.
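
If it helps with the investigation, one way to check whether treating GPT as a no-split module gives every parameter and buffer a device is to inspect the inferred device map directly. A quick sketch using accelerate's infer_auto_device_map on the meta-initialized model from the snippet above:

from accelerate import infer_auto_device_map

# With the whole GPT block kept together, buffers such as
# transformer.h.5.attn.bias should fall under a single device entry.
device_map = infer_auto_device_map(empty_model, no_split_module_classes=["GPT"])
print(device_map)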

cc @SunMarc and @muellerzr

@SunMarc (Member) commented Feb 14, 2025

Thanks!

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
