
How to get access to the vllm backend model #7916

Closed
lianyiyi opened this issue Jan 3, 2025 · 1 comment
Labels
question Further information is requested

Comments

@lianyiyi

lianyiyi commented Jan 3, 2025

Hi, I want to get access to the backend model. Do you know how to make that happen? Thanks!

@tanmayv25
Contributor

The term "model" within the Triton vLLM backend is quite overloaded now :P

The Python model.py that serves the vLLM engine within Triton's Python backend is now called a "Python-based backend".
If that is what you are looking for, the file can be found here: https://github.com/triton-inference-server/vllm_backend/blob/main/src/model.py

See the vLLM section within: https://github.com/triton-inference-server/backend?tab=readme-ov-file#where-can-i-find-all-the-backends-that-are-available-for-triton

This Python-based backend loads model.json, which essentially contains the EngineArgs that get passed to the vLLM engine. See here to learn more: https://github.com/triton-inference-server/vllm_backend?tab=readme-ov-file#using-the-vllm-backend
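
For reference, a minimal sketch of where model.json sits in a Triton model repository, following the layout in the vllm_backend README (the name vllm_model is just illustrative):

```
model_repository/
└── vllm_model/
    ├── 1/
    │   └── model.json    # vLLM EngineArgs for this model
    └── config.pbtxt      # Triton model configuration
```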

The vLLM backend's model.json can point to a Hugging Face model via the "model" field.
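
A minimal sketch of such a model.json, based on the example in the vllm_backend README (the model ID and values are illustrative; any other vLLM EngineArgs can be added as additional keys):

```json
{
    "model": "facebook/opt-125m",
    "disable_log_requests": true,
    "gpu_memory_utilization": 0.5
}
```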

@tanmayv25 tanmayv25 added the question Further information is requested label Jan 25, 2025