
Fix decoupled gpu output error handling #362

Merged

Conversation

@kthui (Contributor) commented on May 22, 2024

Previous PR: #361
Related PR: triton-inference-server/server#7258

Fix two things:

  • The decoupled GPU tensor output path does not check whether the GPU output buffers were successfully allocated. This PR adds that check.
  • If there is an error in a buffer, an exception is now thrown instead of only being logged. The stub/parent turn flag is updated correctly whether an exception is raised or the send completes successfully (see the sketch after this list).
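A minimal C++ sketch of the behavior described by the two bullets above. This is not the actual python_backend code; `GpuOutputBuffer`, `ScopedTurnFlag`, and `SendDecoupledGpuOutputs` are hypothetical names standing in for the real response-sender internals.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a GPU output buffer returned by the allocator.
struct GpuOutputBuffer {
  void* data = nullptr;
  std::string error;  // non-empty if the allocation failed
};

// RAII guard: flips the stub/parent turn flag on scope exit, so the flag is
// updated correctly whether the function returns normally or throws.
class ScopedTurnFlag {
 public:
  explicit ScopedTurnFlag(bool* flag) : flag_(flag) {}
  ~ScopedTurnFlag() { *flag_ = !*flag_; }

 private:
  bool* flag_;
};

void
SendDecoupledGpuOutputs(
    const std::vector<GpuOutputBuffer>& buffers, bool* stub_owns_turn)
{
  ScopedTurnFlag turn_guard(stub_owns_turn);  // updated on success or throw

  for (const auto& buf : buffers) {
    // New behavior: check the allocation result and throw instead of only
    // logging, so the caller can attach the error to the response.
    if (buf.data == nullptr || !buf.error.empty()) {
      throw std::runtime_error(
          "Failed to allocate GPU output buffer: " + buf.error);
    }
    // ... copy the output tensor into buf.data and send the response ...
  }
}
```

The RAII guard is one way to keep the turn flag consistent on both the success and the exception path, which is the behavior the second bullet describes.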

Enhance one thing:

  • The error message returned with the failed request now matches the full message printed to the log, instead of only the basic message raised by the model (a small illustration follows).
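A rough illustration of the enhancement, again with hypothetical names and a made-up message format: the same fully formatted error string is both printed to the log and attached to the failed response, rather than returning only the model's original message.

```cpp
#include <iostream>
#include <stdexcept>
#include <string>

// Hypothetical helper: builds the full error string once so that the log line
// and the error returned to the request cannot drift apart.
std::string
BuildFullError(const std::string& model_name, const std::string& model_error)
{
  return "Failed to process the request(s) for model '" + model_name +
         "', message: " + model_error;
}

int
main()
{
  try {
    throw std::runtime_error("example error raised by the model");
  }
  catch (const std::exception& ex) {
    const std::string full_error = BuildFullError("my_model", ex.what());
    std::cerr << full_error << std::endl;  // printed to the log
    // ... the same full_error string is set on the response returned
    //     to the request ...
  }
  return 0;
}
```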

Next PRs:

@kthui kthui force-pushed the jacky-res-sender-fix-decouple-gpu-output-error branch from ca57c93 to 2a48913 on May 31, 2024 at 22:32
@kthui kthui requested a review from Tabrizian May 31, 2024 23:05
@kthui kthui merged commit 9f2865d into jacky-res-sender-main Jun 3, 2024
3 checks passed
@kthui kthui deleted the jacky-res-sender-fix-decouple-gpu-output-error branch June 3, 2024 21:16
kthui added a commit that referenced this pull request Jun 6, 2024
* Add response sender to non-decoupled models and unify data pipelines (#360)

* Add response sender to non-decoupled model and unify data pipelines

* Rename variable and class name

* Fix decoupled batch statistics to account for implicit batch size (#361)

* Fix decoupled gpu output error handling (#362)

* Fix decoupled gpu output error handling

* Return full error string upon exception from model

* Response sender to check for improper non-decoupled model usage (#363)

* Response sender to check for improper non-decoupled model usage

* Force close response sender on exception

* Rename functions