
xprobe/xinference:v0.14.0.post1: running Alibaba-NLP/gte-Qwen2-7B-instruct fails with "No module named 'flash_attn'" and asks to install flash_attn #2022

Open
1 of 3 tasks
xujingsen521 opened this issue Aug 6, 2024 · 2 comments

Comments

@xujingsen521

System Info

CentOS 7, Docker 26.0.0

Running Xinference with Docker?

  • [x] docker
  • [ ] pip install
  • [ ] installation from source

Version info

xprobe/xinference:v0.14.0.post1

The command used to start Xinference

docker run -itd --name="xjs-inference" \
  -v ./:/app \
  -P -p 9995-9999:9995-9999 \
  --gpus '"device=0,1"' \
  xprobe/xinference:v0.14.0.post1 \
  xinference-local -H 0.0.0.0 --port 9997 --log-level debug

Reproduction

xinference register --model-type embedding --file gte-Qwen2-7B-instruct.json --persist
xinference launch \
  --model-name gte-Qwen2-7B-instruct \
  --model-type embedding
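For reference, the registration step above reads a custom embedding spec file. A minimal sketch of what gte-Qwen2-7B-instruct.json might contain, following xinference's custom embedding model format — the values here (dimensions, max_tokens, languages) are assumptions taken from the model card and should be verified before use:

```shell
# Hypothetical sketch of the spec file consumed by `xinference register`.
# Field names follow xinference's custom embedding format; the values
# (dimensions, max_tokens, language) are assumptions -- check the
# Alibaba-NLP/gte-Qwen2-7B-instruct model card before relying on them.
cat > gte-Qwen2-7B-instruct.json <<'EOF'
{
  "model_name": "gte-Qwen2-7B-instruct",
  "dimensions": 3584,
  "max_tokens": 32768,
  "language": ["en", "zh"],
  "model_id": "Alibaba-NLP/gte-Qwen2-7B-instruct"
}
EOF

# Sanity-check that the file is valid JSON before registering it.
python -m json.tool gte-Qwen2-7B-instruct.json
```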

Expected behavior

A Docker image that ships with nvcc and flash_attn version 2.5.6 or later.
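Until an official image ships these, one workaround is to extend the released tag yourself. A minimal, hypothetical sketch, assuming a prebuilt flash-attn wheel exists for the base image's torch/CUDA combination (otherwise pip falls back to compiling from source, which is exactly where the missing nvcc bites):

```dockerfile
# Hypothetical Dockerfile -- not an official image. Extends the released
# tag and installs flash-attn; if no prebuilt wheel matches the image's
# torch/CUDA versions, this step will need nvcc available in the image.
FROM xprobe/xinference:v0.14.0.post1
RUN pip install ninja && \
    pip install "flash-attn>=2.5.6" --no-build-isolation
```

Build it with `docker build -t xinference-flash-attn .` and reuse the original docker run command with the new tag.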

@XprobeBot XprobeBot added the gpu label Aug 6, 2024
@XprobeBot XprobeBot added this to the v0.14.0 milestone Aug 6, 2024
@XprobeBot XprobeBot modified the milestones: v0.14, v0.15 Sep 3, 2024
@jim1997

jim1997 commented Sep 20, 2024

Also looking for this.

@ConleyKong

Here is a link to a flash-attn package that works with xinference 0.15.2: https://pan.baidu.com/s/1OTOKLzKcSukjvDqQ-F6XVQ?pwd=1111 (extraction code: 1111)
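Whichever prebuilt wheel you grab (from the link above or elsewhere), it only loads if it was built against the same Python, torch, and CUDA versions as the container; a mismatched wheel typically fails at import time with undefined-symbol errors. A quick check to run inside the container before and after `pip install <wheel>`:

```shell
# Print the interpreter version and whether flash_attn is importable.
# Compare these against the wheel's filename tags (cpXX, torch/cu versions)
# before installing it.
python - <<'EOF'
import sys, importlib.util
print("python:", sys.version.split()[0])
print("flash_attn importable:", importlib.util.find_spec("flash_attn") is not None)
EOF
```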

@XprobeBot XprobeBot modified the milestones: v0.15, v0.16 Oct 30, 2024
@XprobeBot XprobeBot modified the milestones: v0.16, v1.x Nov 25, 2024