
Cannot use the NPU cards when running inference on Ascend 910B #6671

Closed
1 task done
winni0 opened this issue Jan 16, 2025 · 3 comments
Labels
npu This problem is related to NPU devices solved This problem has been already solved

Comments

winni0 commented Jan 16, 2025

Reminder

  • I have read the above rules and searched the existing issues.

System Info

  • llamafactory version: 0.9.2.dev0
  • Platform: Linux-5.15.0-127-generic-aarch64-with-glibc2.35
  • Python version: 3.10.15
  • PyTorch version: 2.3.1 (NPU)
  • Transformers version: 4.46.1
  • Datasets version: 3.1.0
  • Accelerate version: 1.0.1
  • PEFT version: 0.12.0
  • TRL version: 0.9.6
  • NPU type: Ascend910B3
  • CANN version: 8.0.RC3
  • DeepSpeed version: 0.14.4

Reproduction

User: 你是谁? (Who are you?)
Assistant: [W compiler_depend.ts:51] Warning: CAUTION: The operator 'aten::isin.Tensor_Tensor_out' is not currently supported on the NPU backend and will fall back to run on the CPU. This may have performance implications. (function npu_cpu_fallback)
/usr/local/python3.10/lib/python3.10/site-packages/transformers/generation/logits_process.py:1634: UserWarning: AutoNonVariableTypeMode is deprecated and will be removed in 1.10 release. For kernel implementations please use AutoDispatchBelowADInplaceOrView instead, If you are looking for a user facing API to enable running your inference-only workload, please use c10::InferenceMode. Using AutoDispatchBelowADInplaceOrView in user code is under risk of producing silent wrong result in some edge cases. See Note [AutoDispatchBelowAutograd] for more details. (Triggered internally at build/CMakeFiles/torch_npu.dir/compiler_depend.ts:74.)
  scores_processed = torch.where(scores != scores, 0.0, scores)

Others

The command executed was: ASCEND_RT_VISIBLE_DEVICES=0,1,2,3 llamafactory-cli chat \
    --model_name_or_path /LLaMA-Factory-main/model/Qwen2-VL-2B-Instruct \
    --template qwen2_vl \
    --infer_backend huggingface
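
A quick way to rule out a visibility problem (a minimal sketch, assuming the torch_npu adapter from the setup above) is to ask PyTorch how many NPUs it sees under the same environment:

```python
# Run with the same ASCEND_RT_VISIBLE_DEVICES=0,1,2,3 setting as the chat command.
import torch
import torch_npu  # Ascend adapter; registers the "npu" device type with PyTorch

print(torch.npu.is_available())   # True if the NPU runtime initializes
print(torch.npu.device_count())   # expected 4 when cards 0-3 are visible

# Creating a tensor on npu:0 confirms ops dispatch to the card rather than the CPU.
x = torch.ones(2, 2, device="npu:0")
print(x.device)
```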

@winni0 winni0 added bug Something isn't working pending This problem is yet to be addressed labels Jan 16, 2025
@github-actions github-actions bot added the npu This problem is related to NPU devices label Jan 16, 2025
1737686924 commented

The command executed was: ASCEND_LAUNCH_BLOCKING=0,1,2,3 llamafactory-cli chat --model_name_or_path /LLaMA-Factory-main/model/Qwen2-VL-2B-Instruct --template qwen2_vl --infer_backend huggingface

ASCEND_RT_VISIBLE_DEVICES is the variable that specifies the NPU cards.
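
For context, the two variables do different jobs: ASCEND_RT_VISIBLE_DEVICES takes a comma-separated card list, while ASCEND_LAUNCH_BLOCKING is a 0/1 debugging switch that forces synchronous kernel launches. A minimal sketch of the distinction (assuming torch_npu, with the variables set before the NPU runtime initializes):

```python
# Sketch: the roles of the two environment variables (set before importing torch_npu).
import os
os.environ["ASCEND_RT_VISIBLE_DEVICES"] = "0,1,2,3"  # which physical cards are visible
os.environ["ASCEND_LAUNCH_BLOCKING"] = "1"           # debug switch (0/1), not a device list

import torch
import torch_npu

print(torch.npu.device_count())  # 4: logical npu:0..npu:3 map to the listed cards
```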


winni0 (Author) commented Jan 16, 2025

I changed it, but I still cannot use the cards.

codemayq (Collaborator) commented

If you see this Warning: CAUTION: The operator 'aten::isin.Tensor_Tensor_out' is not currently supported on the NPU backend and will fall back to run on the CPU. This may have performance implications. (function npu_cpu_fallback), it means the NPU is already being invoked.
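
Put differently, the npu_cpu_fallback message covers a single operator, not the whole model: aten::isin had no NPU kernel in this torch_npu release, so that one op is computed on the CPU and its result handed back to the NPU. A small sketch of the same behavior (assuming torch_npu 2.3.1, as in the System Info above):

```python
import torch
import torch_npu

a = torch.tensor([1, 2, 3], device="npu:0")
b = torch.tensor([2, 3], device="npu:0")

# torch.isin lowers to aten::isin.Tensor_Tensor_out; without an NPU kernel it
# triggers the npu_cpu_fallback warning, yet the result is still an NPU tensor.
print(torch.isin(a, b))          # tensor([False, True, True], device='npu:0')
print(torch.isin(a, b).device)   # npu:0, so the rest of the graph stays on the card
```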

@hiyouga hiyouga added solved This problem has been already solved and removed bug Something isn't working pending This problem is yet to be addressed labels Jan 17, 2025
@hiyouga hiyouga closed this as completed Jan 17, 2025