Can this model run on llama.cpp? #42
KagaJiankui asked this question in Q&A
For production deployment, can the TexTeller model be run with a pure C++ framework such as llama.cpp or fastllm.cpp? It would also help to provide q4/q8 quantized models and deployment files such as a GitHub Actions workflow or a Dockerfile.
Answered by OleehyO on Mar 5, 2024
TexTeller uses a custom architecture, so llama.cpp/fastllm.cpp are unlikely to be compatible with it. If you want to move to a pure C++ stack, consider converting the model to ONNX and running inference with TensorRT. Also, this version of TexTeller still underperforms in some scenarios, so q4/q8 quantization hasn't been considered yet, but the next version should include it.
Answer selected by OleehyO