refactor: Delete response factory after sending complete final #373
What does the PR do?
Delete the response factory as soon as the response sender sends a response with the complete final flag. Even if the Python process does not get a chance to garbage collect the response sender before the model is unloaded, the response factory has already been destructed by then, which allows the model to unload gracefully.
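The idea can be sketched in a few lines of Python. This is only an illustrative model, not the backend's actual code: `ResponseFactory`, `ResponseSender`, `release()`, and the `COMPLETE_FINAL` value are hypothetical names chosen for this sketch. It shows the sender dropping its factory as soon as the final response goes out, and refusing any later send rather than touching the deleted factory.

```python
COMPLETE_FINAL = 1  # hypothetical flag value, for illustration only


class ResponseFactory:
    """Hypothetical stand-in for the backend's response factory."""

    def __init__(self):
        self.released = False
        self.sent = []

    def send(self, payload, flags):
        self.sent.append((payload, flags))

    def release(self):
        # Models destroying the underlying factory object.
        self.released = True


class ResponseSender:
    """Hypothetical sender that owns the factory until the final response."""

    def __init__(self, factory):
        self._factory = factory

    def send(self, payload=None, flags=0):
        if self._factory is None:
            # The factory is gone; never dereference it again.
            raise RuntimeError("final response already sent")
        self._factory.send(payload, flags)
        if flags & COMPLETE_FINAL:
            # Release eagerly instead of waiting for the sender
            # itself to be garbage collected.
            self._factory.release()
            self._factory = None


factory = ResponseFactory()
sender = ResponseSender(factory)
sender.send("partial")
sender.send("last", flags=COMPLETE_FINAL)
```

In this sketch the factory is released while `sender` is still alive, so nothing depends on when the sender object is collected.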
Checklist
<commit_type>: <Title>
Commit Type:
Check the conventional commit type box here and add the label to the GitHub PR.
Related PRs:
triton-inference-server/server#7504
triton-inference-server/vllm_backend#55
Where should the reviewer start?
N/A
Test plan:
This PR refactors how a response factory object is deleted. It is neither a feature nor a bug fix, so existing tests should be sufficient to catch any regression.
A test enhancement is added for the response sender to ensure that the deleted response factory cannot be accidentally dereferenced. See triton-inference-server/server#7504 for more details.
Caveats:
N/A
Background
N/A
Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)
N/A