Add support for nvclip endpoint #147
Conversation
Force-pushed from baad3cd to 9472697
Force-pushed from 9472697 to 3e72239
@@ -64,3 +64,9 @@ def _select_model_name(self, config: InputsConfig, index: int) -> str:
            raise GenAIPerfException(
                f"Model selection strategy '{config.model_selection_strategy}' is unsupported"
            )

    def _add_request_params(
This gets reused among converters, so I moved it to the base converter class. Other converters can reimplement it for custom logic.
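A minimal sketch of the pattern described in this comment: a shared `_add_request_params` on a base converter class that a subclass overrides for endpoint-specific logic. The class names, fields, and config shape below are illustrative assumptions, not the actual GenAI-Perf API.

```python
# Hypothetical sketch: shared request-param logic on a base class,
# with a subclass reusing it and adding custom fields.


class BaseConverter:
    def _add_request_params(self, payload: dict, config) -> None:
        # Default behavior shared by most converters: copy any
        # user-provided extra inputs into the request payload.
        for key, value in getattr(config, "extra_inputs", {}).items():
            payload[key] = value


class NvClipConverter(BaseConverter):
    def _add_request_params(self, payload: dict, config) -> None:
        # Reuse the shared logic, then apply endpoint-specific tweaks.
        super()._add_request_params(payload, config)
        payload["encoding_format"] = "float"  # illustrative field


class Config:
    extra_inputs = {"temperature": 0.5}


payload = {"model": "nvclip"}
NvClipConverter()._add_request_params(payload, Config())
```

With this layout, a converter that needs no customization inherits the base behavior for free, and a converter like the one in this PR only overrides the single hook it cares about.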
Force-pushed from 3e72239 to 0353bb9
################################################################
# Triton backends
################################################################
TENSORRTLLM = auto()
It looks like the logic added here expected the options to list only the Triton backends. However, because that expectation was unclear when OutputFormats were added here, the list got out of sync.
We might need a better long-term approach, tracked as its own ticket, but I did some cleanup here to fix the issue for now. See the help prompt below with the correct backends displayed.
@@ -634,7 +638,7 @@ def _add_endpoint_args(parser):
    endpoint_group.add_argument(
        "--backend",
        type=str,
-       choices=utils.get_enum_names(ic.OutputFormat)[2:],
+       choices=utils.get_enum_names(ic.OutputFormat)[0:2],
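The comment above notes why this kind of positional slice is fragile: it is correct only while the backend members happen to occupy a particular range of the enum. The sketch below illustrates the failure mode with made-up enum members and a simplified `get_enum_names`; it is not the real `OutputFormat` enum, and the explicit-subset alternative at the end is only one possible longer-term fix.

```python
# Sketch of the fragility: selecting "--backend" choices by slicing
# an enum's member list silently breaks if members are inserted or
# reordered. Enum members here are illustrative.
from enum import Enum, auto


class OutputFormat(Enum):
    # Triton backends (assumed to come first)
    TENSORRTLLM = auto()
    VLLM = auto()
    # Other endpoint formats
    OPENAI_CHAT_COMPLETIONS = auto()
    NVCLIP = auto()


def get_enum_names(enum_cls):
    # Enum iteration follows definition order.
    return [member.name.lower() for member in enum_cls]


# Positional slice: correct only while the Triton backends happen
# to occupy the first two slots of the enum.
backends_by_slice = get_enum_names(OutputFormat)[0:2]

# More robust alternative: name the backend subset explicitly, so
# adding a new OutputFormat member cannot corrupt the CLI choices.
TRITON_BACKENDS = (OutputFormat.TENSORRTLLM, OutputFormat.VLLM)
backends_explicit = [m.name.lower() for m in TRITON_BACKENDS]
```

Keeping an explicit subset means a future `OutputFormat` addition fails loudly (the new member is simply absent from the list) instead of silently shifting which names the slice picks up.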
Add support for nvclip converter
Force-pushed from 0353bb9 to 986fe50
This looks good to me; I haven't seen anything wrong with this PR.
One general comment, however, which isn't related to this PR in particular: I am starting to wish there were more docstrings throughout the codebase to better describe and document the APIs we create, not only for external users but also for ourselves when we have to maintain the code in the future.
Don't take this as a request to document this one PR thoroughly, but it may be something to consider for future work / refactoring.
As an example, I feel I haven't documented my own Python script in third_party well enough, though I'd defend against my own argument by saying it is a temporary fix until abseil's fix trickles down to us:
https://github.com/triton-inference-server/third_party/blob/main/tools/patch.py
Thanks for reviewing and providing feedback! I'm happy to talk sometime about how best to do this for GenAI-Perf and our codebase in general. The base converter class has some docstrings documenting the converter functions, though we can probably do more. Finding clearer ways to document is always good.
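For concreteness, the style of docstring being discussed might look like the sketch below. The function name, signature, and behavior are hypothetical, not the real GenAI-Perf converter API; the point is only the Args/Returns documentation shape.

```python
# Illustrative only: a hypothetical converter function documented
# in the style the review is asking for.
def convert(generic_dataset, config):
    """Convert a generic dataset into endpoint-specific request payloads.

    Args:
        generic_dataset: Parsed input rows in the tool's internal format.
        config: Options controlling model selection and extra inputs.

    Returns:
        A dict with a "data" list containing one payload per request.
    """
    return {"data": [{"row": row} for row in generic_dataset]}


result = convert(["hello", "world"], None)
```

A short docstring like this costs little per function but records intent (what the inputs are, what shape comes back) that is otherwise only recoverable by reading call sites.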
Add support for the NVCLIP NIM endpoint.
Benchmarking NVCLIP: