Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for nvclip endpoint #147

Merged
merged 1 commit into from
Oct 22, 2024
Merged

Add support for nvclip endpoint #147

merged 1 commit into from
Oct 22, 2024

Conversation

dyastremsky
Copy link
Contributor

@dyastremsky dyastremsky commented Oct 22, 2024

Add support for NVClip NIM endpoint.

Benchmarking NVCLip:
image

@@ -64,3 +64,9 @@ def _select_model_name(self, config: InputsConfig, index: int) -> str:
raise GenAIPerfException(
f"Model selection strategy '{config.model_selection_strategy}' is unsupported"
)

def _add_request_params(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This gets reused among converters, so I moved it to the base converter class. Other converts can reimplement it for custom logic.

################################################################
# Triton backends
################################################################
TENSORRTLLM = auto()
Copy link
Contributor Author

@dyastremsky dyastremsky Oct 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like the logic added here expected the options to only list the Triton backends. However, because that was unclear when adding OutputFormats here, it got out of sync.

We might need a better approach long-term as its own ticket, but I did clean-up here to fix the issue for now. See help prompt with the correct backends displayed below.
image

@@ -634,7 +638,7 @@ def _add_endpoint_args(parser):
endpoint_group.add_argument(
"--backend",
type=str,
choices=utils.get_enum_names(ic.OutputFormat)[2:],
choices=utils.get_enum_names(ic.OutputFormat)[0:2],
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We might need a better approach long-term as its own ticket, but I did clean-up here to fix the issue for now. See help prompt with the correct backends displayed below.
image

Add support for nvclip converter

Add support for nvclip converter

Add support for nvclip converter

Add support for nvclip converter
Copy link
Contributor

@nicolasnoble nicolasnoble left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me, I haven't seen anything wrong with this PR.

One general comment however, which isn't related to this PR in particular, but I am starting to wish there was a bit more docstrings throughout the codebase, to better describe and document the APIs that we create, not only for the external users, but also for ourselves, when we have to maintain code in the future.

Don't take this as a request to document this one PR thoroughly, but this may be something to consider for future work / refactoring.

As an example, I feel I haven't documented enough my own python script that I submitted in third_party, tho I'd defend against my own argument saying this is a temporary fix until abseil's fix trickled down to us:

https://github.com/triton-inference-server/third_party/blob/main/tools/patch.py

@dyastremsky
Copy link
Contributor Author

This looks good to me, I haven't seen anything wrong with this PR.

One general comment however, which isn't related to this PR in particular, but I am starting to wish there was a bit more docstrings throughout the codebase, to better describe and document the APIs that we create, not only for the external users, but also for ourselves, when we have to maintain code in the future.

Don't take this as a request to document this one PR thoroughly, but this may be something to consider for future work / refactoring.

As an example, I feel I haven't documented enough my own python script that I submitted in third_party, tho I'd defend against my own argument saying this is a temporary fix until abseil's fix trickled down to us:

https://github.com/triton-inference-server/third_party/blob/main/tools/patch.py

Thanks for reviewing and providing feedback! I'm happy to talk about how to best do this for GenAI-Perf and our codebase in general sometime. The base converter class has some docstrings to document converter functions, though we can probably do more. Finding clearer ways to document is always good.

@dyastremsky dyastremsky merged commit 55f1f34 into main Oct 22, 2024
6 checks passed
@dyastremsky dyastremsky deleted the dyas-nvclip branch October 22, 2024 23:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants