Commit
Add fix for leak of ModelClient in Triton creation to changelog
piotrm-nvidia committed Nov 15, 2023
1 parent c624fcd commit f828f5d
Showing 2 changed files with 8 additions and 1 deletion.
8 changes: 8 additions & 0 deletions CHANGELOG.md
@@ -16,6 +16,14 @@ limitations under the License.

# Changelog

## Unreleased

- Fix: Leak of ModelClient during Triton creation

[//]: <> (put here on external component update with short summary what change or link to changelog)

- Version of [Triton Inference Server](https://github.com/triton-inference-server/) embedded in wheel: [2.39.0](https://github.com/triton-inference-server/server/releases/tag/v2.39.0)

## 0.4.1 (2023-11-09)

- New: Location of workspaces with temporary Triton model repositories and communication file sockets can be configured via the `$PYTRITON_HOME` environment variable
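The `$PYTRITON_HOME` override mentioned above can be sketched with the standard library alone. The fallback path below is purely illustrative, not taken from PyTriton itself; consult the library's documentation for the real default workspace location.

```python
import os
from pathlib import Path


def pytriton_workspace_root() -> Path:
    """Resolve the workspace root: honor $PYTRITON_HOME when set,
    otherwise fall back to a default (hypothetical here)."""
    override = os.environ.get("PYTRITON_HOME")
    if override:
        return Path(override).expanduser()
    # Illustrative fallback only; not PyTriton's actual default.
    return Path.home() / ".cache" / "pytriton"


os.environ["PYTRITON_HOME"] = "/tmp/pytriton-workspaces"
print(pytriton_workspace_root())  # → /tmp/pytriton-workspaces
```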
1 change: 0 additions & 1 deletion docs/known_issues.md
@@ -21,5 +21,4 @@ limitations under the License.
- Running multiple scripts hosting PyTriton on the same machine or container is not feasible.
- Deadlocks may occur in some models when employing the NCCL communication library and multiple Inference Callables are triggered concurrently. This issue can be observed when deploying multiple instances of the same model or multiple models within a single server script. Additional information about this issue can be found [here](https://docs.nvidia.com/deeplearning/nccl/user-guide/docs/usage/communicators.html#using-multiple-nccl-communicators-concurrently).
- Enabling verbose logging may cause a significant performance drop in model inference.
- Creation of a Triton object leaks a single instance of the ModelClient object. This is a known issue and will be fixed in future releases. This may cause gevent warnings to be printed to the console.
- GRPC ModelClient doesn't support timeouts for model configuration and model metadata requests due to a limitation in the underlying tritonclient library.
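Independent of the internal leak fixed by this commit, clients created in user code should be released deterministically. The sketch below uses a hypothetical `DummyClient` stand-in to show the context-manager discipline; with PyTriton the same `with` pattern would apply to `pytriton.client.ModelClient` (an assumption about that API's exact shape, not confirmed by this diff).

```python
class DummyClient:
    """Hypothetical stand-in for a network client that must be closed."""

    open_count = 0  # tracks clients that were created but not yet closed

    def __init__(self):
        DummyClient.open_count += 1

    def close(self):
        DummyClient.open_count -= 1

    def __enter__(self):
        return self

    def __exit__(self, *exc):
        # Guarantees cleanup even if the body raises, so no client leaks.
        self.close()


with DummyClient() as client:
    pass  # issue inference requests here

print(DummyClient.open_count)  # → 0
```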
