
RewardModel.from_pretrained() loads redundant weights (incurs extra ~30GB of RAM) #75

Open
angie-chen55 opened this issue on Sep 20, 2023

Hi,

Whenever a saved RewardModel is loaded via RewardModel.from_pretrained(model_path, flash_attn=True, fp16=False, bf16=True, low_cpu_mem_usage=True), it downloads the entire sharded checkpoint (https://github.com/huggingface/transformers/blob/v4.33.2/src/transformers/modeling_utils.py#L2876), which is already ~30GB because it contains all of the reward model's weights, i.e. both the backbone model and the reward head. It then calls RewardModel.__init__() (via https://github.com/huggingface/transformers/blob/v4.33.2/src/transformers/modeling_utils.py#L2966), which separately loads all of the backbone model's weights (SFT10K, another ~30GB). Surely loading a pretrained model shouldn't require materializing the backbone weights twice?
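
To make the double load concrete, here is a minimal sketch of the pattern (just an illustration, not the actual alpaca_farm code: the RewardConfig class, the backbone_model_name_or_path field, and the scalar reward head are assumptions):

```python
# Illustrative sketch only: class and field names are assumptions, not the real alpaca_farm code.
import torch
import transformers


class RewardConfig(transformers.PretrainedConfig):
    model_type = "reward_model"

    def __init__(self, backbone_model_name_or_path=None, **kwargs):
        super().__init__(**kwargs)
        self.backbone_model_name_or_path = backbone_model_name_or_path


class RewardModel(transformers.PreTrainedModel):
    config_class = RewardConfig

    def __init__(self, config: RewardConfig, **kwargs):
        super().__init__(config)
        # from_pretrained() instantiates the model first, so __init__ eagerly
        # loads the full backbone checkpoint (SFT10K, ~30GB) from the hub/disk ...
        self.backbone_model = transformers.AutoModelForCausalLM.from_pretrained(
            config.backbone_model_name_or_path, **kwargs
        )
        # ... plus a small scalar reward head on top of the backbone's hidden states.
        self.reward_head = torch.nn.Linear(self.backbone_model.config.hidden_size, 1)


# from_pretrained() then loads the saved reward-model shards (backbone + head,
# another ~30GB) into the freshly built model, overwriting the backbone weights
# that __init__ just loaded, hence the extra ~30GB of RAM.
# reward_model = RewardModel.from_pretrained(model_path, low_cpu_mem_usage=True)
```

One possible way to avoid this (just an idea, not something the library does today) would be for __init__ to build the backbone from its config only, e.g. via transformers.AutoModelForCausalLM.from_config(...), so the backbone weights are only materialized once, when from_pretrained() loads the saved shards.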

Thanks!

lolipopshock pushed a commit to lolipopshock/alpaca_farm that referenced this issue Sep 24, 2023