It’s most likely an out-of-memory error from running on a Mac with the CPU, as discussed in this issue: huggingface/diffusers#5894

For running a large language model, a dedicated GPU is highly recommended. If you don’t have access to one, I suggest Google Colab, which offers a free tier with GPU support.
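A minimal sketch of picking the best available device before loading the model (the helper name `pick_device` is hypothetical, not part of the repo's `kvcache.py`; it assumes PyTorch and falls back to `"cpu"` if it isn't installed):

```python
def pick_device() -> str:
    """Return the best available torch device string, or "cpu" as a fallback."""
    try:
        import torch
    except ImportError:
        # No PyTorch at all: nothing to probe, run on CPU.
        return "cpu"
    if torch.cuda.is_available():
        return "cuda"  # NVIDIA GPU, e.g. on a Colab GPU runtime
    if torch.backends.mps.is_available():
        return "mps"   # Apple Silicon GPU backend
    return "cpu"

print(pick_device())
```

Note that even with `"mps"` selected, an 8B model with a long context can exceed the memory available on an M1 MacBook, which is why a dedicated GPU (or Colab) is the safer option here.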
M1 MacBook Pro - CPU
Logs:
```
Code/current_projects/CAG main python3 ./kvcache.py --kvcache file --dataset "squad-train" --similarity bertscore --maxKnowledge 5 --maxParagraph 100 --maxQuestion 1000 --modelname "meta-llama/Llama-3.1-8B-Instruct" --randomSeed 0 --output "./result_kvcache.txt"
2025-01-22 13:43:55,158 - INFO - Using device: cpu
2025-01-22 13:43:55,184 - INFO - Use pytorch device_name: mps
2025-01-22 13:43:55,184 - INFO - Load pretrained SentenceTransformer: all-MiniLM-L6-v2
maxKnowledge 5 maxParagraph 100 maxQuestion 1000 randomeSeed 0
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:14<00:00,  3.73s/it]
2025-01-22 13:44:12,200 - WARNING - Some parameters are on the meta device because they were offloaded to the disk.
max_knowledge 5 max_paragraph 100 max_questions 1000
Traceback (most recent call last):
  File "/Users/georgeli/Documents/Code/current_projects/CAG/./kvcache.py", line 500, in <module>
    kvcache_test(args)
  File "/Users/georgeli/Documents/Code/current_projects/CAG/./kvcache.py", line 318, in kvcache_test
    knowledge_cache, prepare_time = prepare_kvcache(knowledges, filepath=kvcache_path, answer_instruction=answer_instruction)
  File "/Users/georgeli/Documents/Code/current_projects/CAG/./kvcache.py", line 190, in prepare_kvcache
    kv = preprocess_knowledge(model, tokenizer, knowledges)
  File "/Users/georgeli/Documents/Code/current_projects/CAG/./kvcache.py", line 118, in preprocess_knowledge
    outputs = model(
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/accelerate/hooks.py", line 170, in new_forward
    output = module._old_forward(*args, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py", line 1190, in forward
    outputs = self.model(
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py", line 945, in forward
    layer_outputs = decoder_layer(
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/accelerate/hooks.py", line 170, in new_forward
    output = module._old_forward(*args, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py", line 676, in forward
    hidden_states, self_attn_weights, present_key_value = self.self_attn(
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/accelerate/hooks.py", line 170, in new_forward
    output = module._old_forward(*args, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py", line 602, in forward
    attn_output = torch.nn.functional.scaled_dot_product_attention(
RuntimeError: Invalid buffer size: 117.59 GB
```
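The failure happens inside `scaled_dot_product_attention`, so the 117.59 GB figure is consistent with an attempt to materialize the full attention-score matrix for a very long context. A rough back-of-envelope sketch (an assumption, not taken from the repo: it supposes fp32 scores, the 32 attention heads of Llama-3.1-8B, and no memory-efficient attention kernel):

```python
def attn_scores_gb(seq_len: int, num_heads: int = 32, bytes_per_elem: int = 4) -> float:
    """Memory in GB for a dense num_heads x seq_len x seq_len score matrix."""
    return num_heads * seq_len * seq_len * bytes_per_elem / 1e9

# A context of roughly 30k tokens already needs ~115 GB for raw scores,
# in the same ballpark as the 117.59 GB buffer in the traceback above.
print(round(attn_scores_gb(30_000), 1))  # → 115.2
```

Since the buffer grows quadratically with sequence length, reducing `--maxKnowledge` (and thus the cached context length) is the quickest way to bring the allocation back within reach on a CPU-only machine.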