Hi, I want to get access to the backend model, do you know how to make it happen? Thanks!
The term "model" within the Triton vLLM backend is quite overloaded now :P
The Python model.py that serves the vLLM engine within Triton's Python backend is now considered one of the Python-based backends. If you are looking for it, the file can be found here: https://github.com/triton-inference-server/vllm_backend/blob/main/src/model.py
See the vLLM section within: https://github.com/triton-inference-server/backend?tab=readme-ov-file#where-can-i-find-all-the-backends-that-are-available-for-triton
This Python-based backend loads model.json, which is essentially the EngineArgs that get passed to the vLLM engine. See here to learn more: https://github.com/triton-inference-server/vllm_backend?tab=readme-ov-file#using-the-vllm-backend
The vLLM backend's model.json can point to a HF model via its "model" field.
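As a sketch of what that looks like, here is a minimal model.json; the model name and engine parameters below are illustrative examples, not values from this thread, and any EngineArgs field accepted by your vLLM version can be added as a key:

```json
{
    "model": "facebook/opt-125m",
    "disable_log_requests": true,
    "gpu_memory_utilization": 0.5
}
```

Per the vLLM backend docs linked above, this file sits inside the model's version directory in the Triton model repository (e.g. `model_repository/<model_name>/1/model.json`), alongside the model's config.pbtxt.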