Integrate vLLM Evaluator #23
Labels: enhancement (New feature or request)
Comments
Initial exploration

Seems like vLLM can run inside a Ray cluster just fine. Basic working code example and usage:
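A minimal sketch of what such an example could look like, assuming vLLM's offline `LLM` API with Ray as the multi-GPU backend; the model name, the parallelism degree, and the `ray.init(address="auto")` call are illustrative assumptions, not details taken from this issue:

```python
import ray
from vllm import LLM, SamplingParams

# Attach to an already-running Ray cluster; with tensor_parallel_size > 1,
# vLLM can use Ray as its backend to shard the model across GPUs.
ray.init(address="auto")

llm = LLM(
    model="meta-llama/Llama-2-7b-hf",  # any HuggingFace model id (placeholder)
    tensor_parallel_size=2,            # shard weights across 2 GPUs
)

params = SamplingParams(temperature=0.8, max_tokens=64)
outputs = llm.generate(["The capital of France is"], params)
for out in outputs:
    # Each RequestOutput holds one or more completions; print the first.
    print(out.outputs[0].text)
```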
Notes on initial exploration:
I think this approach is good enough to use. vLLMEvaluator can be a thin wrapper around vllm, but it will need adapters for sampling and for returning logprobs.
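A sketch of what that adapter could look like, assuming sampling is configured through vLLM's `SamplingParams` and per-token logprobs are read from `CompletionOutput.logprobs`; the class and method names below are hypothetical, not the project's actual interface:

```python
from vllm import LLM, SamplingParams

class VLLMEvaluatorSketch:
    """Hypothetical adapter: wraps vLLM and returns (text, logprobs) per prompt."""

    def __init__(self, model: str, tensor_parallel_size: int = 1):
        self.llm = LLM(model=model, tensor_parallel_size=tensor_parallel_size)

    def sample(self, prompts, temperature=1.0, max_tokens=128, logprobs=1):
        params = SamplingParams(
            temperature=temperature,
            max_tokens=max_tokens,
            logprobs=logprobs,  # ask vLLM to return per-token logprobs
        )
        results = []
        for req in self.llm.generate(prompts, params):
            completion = req.outputs[0]
            results.append({
                "text": completion.text,
                # completion.logprobs is a per-token list; each entry maps
                # token ids to logprob info.
                "logprobs": completion.logprobs,
            })
        return results
```

The exact structure of the returned logprobs differs across vLLM versions (plain floats keyed by token id in older releases, `Logprob` objects in newer ones), so a real adapter would normalize them before handing results back to the evaluation harness.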
vLLM is a high-throughput LLM inference engine that runs HuggingFace models and supports several kinds of model sharding across GPUs via a Ray backend.
In its basic form, vLLM already offers a large speedup over AccelerateEvaluator, which is quite slow.
Basic requirements: