[Ascend] Enable support for 310p NPUs #3532

zer0py2c · 2024-09-22T05:30:03Z

⚠️ Before loading ChatGLM3-6B, you need to modify the model code file: comment out the @torch.jit.script decorator on apply_rotary_pos_emb, as shown here.

...
# @torch.jit.script
def apply_rotary_pos_emb(x: torch.Tensor, rope_cache: torch.Tensor) -> torch.Tensor:
    ...
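
For anyone who prefers to script this edit, here is a minimal sketch. It assumes the decorator sits on the line directly above `def apply_rotary_pos_emb(` in the model code file. The helper names `disable_jit_decorator` and `patch_file` are hypothetical and are not part of FastChat or ChatGLM; the path to the model code file is something you supply yourself.

```python
import re
from pathlib import Path

# Matches an uncommented `@torch.jit.script` line that sits directly above
# `def apply_rotary_pos_emb(`. Other decorated functions are left untouched.
_DECORATOR = re.compile(
    r"^(?P<indent>[ \t]*)@torch\.jit\.script[ \t]*\n"
    r"(?=[ \t]*def apply_rotary_pos_emb\()",
    re.MULTILINE,
)


def disable_jit_decorator(source: str) -> str:
    """Return `source` with the decorator above apply_rotary_pos_emb commented out."""
    return _DECORATOR.sub(
        lambda m: f"{m.group('indent')}# @torch.jit.script\n", source
    )


def patch_file(path: str) -> bool:
    """Patch the file in place; return True if anything changed.

    `path` is whatever model code file you need to edit (an assumption:
    the PR text does not name the file).
    """
    file = Path(path)
    original = file.read_text(encoding="utf-8")
    patched = disable_jit_decorator(original)
    if patched != original:
        file.write_text(patched, encoding="utf-8")
    return patched != original
```

Running `patch_file` twice is harmless, since an already commented decorator no longer matches the pattern.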

ℹ️ Building a Docker image is recommended for environment setup.

python3 -m fastchat.serve.cli --model-path /opt/models/vicuna-7b-v1.3/ --device npu

ASCEND_RT_VISIBLE_DEVICES=0,1 python3 -m fastchat.serve.cli --model-path /opt/models/vicuna-7b-v1.3/ --num-gpus 2 --device npu --max-gpu-memory 12GiB
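
In the multi-NPU command above, the device list and `--num-gpus` have to agree. A small launcher sketch that derives one from the other might look like this. It uses only the flags and the environment variable shown in the commands above; the function name `build_cli_command` is hypothetical, and the 12GiB default is simply the value from the example, not a recommendation.

```python
import os


def build_cli_command(model_path: str, visible_devices: str, max_memory: str = "12GiB"):
    """Build the argv list and environment for fastchat.serve.cli on NPUs.

    `visible_devices` is a comma-separated list of device IDs such as "0,1".
    """
    ids = [d.strip() for d in visible_devices.split(",") if d.strip()]
    if not ids:
        raise ValueError("at least one device ID is required")

    cmd = [
        "python3", "-m", "fastchat.serve.cli",
        "--model-path", model_path,
        "--device", "npu",
    ]
    # Only pass the multi-device flags when more than one NPU is listed,
    # mirroring the two example commands.
    if len(ids) > 1:
        cmd += ["--num-gpus", str(len(ids)), "--max-gpu-memory", max_memory]

    env = {**os.environ, "ASCEND_RT_VISIBLE_DEVICES": ",".join(ids)}
    return cmd, env
```

The result can be passed to `subprocess.run(cmd, env=env, check=True)` to start the CLI.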

Adapted Vicuna and ChatGLM models on 310p NPU

1141f19

zer0py2c closed this Sep 22, 2024

Fixed module import error

6b65541

zer0py2c reopened this Sep 22, 2024

zer0py2c force-pushed the main branch from 0f122ae to 6b65541 on September 22, 2024 06:15

Added judgment of chip type

0349c1b

zer0py2c closed this Sep 23, 2024
