
Fix decoupled gpu output error handling #362

Merged

Conversation

@kthui (Contributor) commented on May 22, 2024

Previous PR: #361
Related PR: triton-inference-server/server#7258

Fix two things:

  • The decoupled GPU tensor output path does not check whether the GPU output buffers were successfully allocated. This PR adds that check.
  • If there is an error in a buffer, an exception is now thrown instead of only being logged. The stub/parent turn flag is updated correctly whether an exception is raised or the send completes successfully (see the sketch after this list).
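A minimal C++ sketch of the behavior described by the two bullets above. This is not the actual python_backend code; `GpuOutputBuffer`, `ScopedTurnFlag`, and `SendDecoupledGpuOutputs` are hypothetical names standing in for the real response-sender internals.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a GPU output buffer returned by the allocator.
struct GpuOutputBuffer {
  void* data = nullptr;
  std::string error;  // non-empty if the allocation failed
};

// RAII guard: flips the stub/parent turn flag on scope exit, so the flag is
// updated correctly whether the function returns normally or throws.
class ScopedTurnFlag {
 public:
  explicit ScopedTurnFlag(bool* flag) : flag_(flag) {}
  ~ScopedTurnFlag() { *flag_ = !*flag_; }

 private:
  bool* flag_;
};

void
SendDecoupledGpuOutputs(
    const std::vector<GpuOutputBuffer>& buffers, bool* stub_owns_turn)
{
  ScopedTurnFlag turn_guard(stub_owns_turn);  // updated on success or throw

  for (const auto& buf : buffers) {
    // New behavior: check the allocation result and throw instead of only
    // logging, so the caller can attach the error to the response.
    if (buf.data == nullptr || !buf.error.empty()) {
      throw std::runtime_error(
          "Failed to allocate GPU output buffer: " + buf.error);
    }
    // ... copy the output tensor into buf.data and send the response ...
  }
}
```

The RAII guard is one way to keep the turn flag consistent on both the success and the exception path, which is the behavior the second bullet describes.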

Enhance one thing:

  • The error message returned with the failed request now matches the full message printed to the log, instead of only the basic message raised by the model (a small illustration follows).
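A rough illustration of the enhancement, again with hypothetical names and a made-up message format: the same fully formatted error string is both printed to the log and attached to the failed response, rather than returning only the model's original message.

```cpp
#include <iostream>
#include <stdexcept>
#include <string>

// Hypothetical helper: builds the full error string once so that the log line
// and the error returned to the request cannot drift apart.
std::string
BuildFullError(const std::string& model_name, const std::string& model_error)
{
  return "Failed to process the request(s) for model '" + model_name +
         "', message: " + model_error;
}

int
main()
{
  try {
    throw std::runtime_error("example error raised by the model");
  }
  catch (const std::exception& ex) {
    const std::string full_error = BuildFullError("my_model", ex.what());
    std::cerr << full_error << std::endl;  // printed to the log
    // ... the same full_error string is set on the response returned
    //     to the request ...
  }
  return 0;
}
```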

Next PRs:

@kthui kthui force-pushed the jacky-res-sender-fix-decouple-gpu-output-error branch from ca57c93 to 2a48913 on May 31, 2024 at 22:32
@kthui kthui requested a review from Tabrizian May 31, 2024 23:05
@kthui kthui merged commit 9f2865d into jacky-res-sender-main Jun 3, 2024
3 checks passed
@kthui kthui deleted the jacky-res-sender-fix-decouple-gpu-output-error branch June 3, 2024 21:16
kthui added a commit that referenced this pull request Jun 6, 2024
* Add response sender to non-decoupled models and unify data pipelines (#360)

* Add response sender to non-decoupled model and unify data pipelines

* Rename variable and class name

* Fix decoupled batch statistics to account for implicit batch size (#361)

* Fix decoupled gpu output error handling (#362)

* Fix decoupled gpu output error handling

* Return full error string upon exception from model

* Response sender to check for improper non-decoupled model usage (#363)

* Response sender to check for improper non-decoupled model usage

* Force close response sender on exception

* Rename functions