Commit
Add fix for leak of ModelClient in Triton creation to changelog
piotrm-nvidia committed Nov 15, 2023
1 parent c624fcd commit f828f5d
Showing 2 changed files with 8 additions and 1 deletion.
8 changes: 8 additions & 0 deletions CHANGELOG.md
@@ -16,6 +16,14 @@ limitations under the License.

# Changelog

## Unreleased

- Fix: Leak of ModelClient during Triton creation

[//]: <> (put here on external component update with short summary what change or link to changelog)

- Version of [Triton Inference Server](https://github.com/triton-inference-server/) embedded in wheel: [2.39.0](https://github.com/triton-inference-server/server/releases/tag/v2.39.0)

## 0.4.1 (2023-11-09)

- New: Location of workspaces with temporary Triton model repositories and communication file sockets can be configured via the `$PYTRITON_HOME` environment variable
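The `$PYTRITON_HOME` override mentioned above can be sketched with the standard library alone. The fallback path below is purely illustrative, not taken from PyTriton itself; consult the library's documentation for the real default workspace location.

```python
import os
from pathlib import Path


def pytriton_workspace_root() -> Path:
    """Resolve the workspace root: honor $PYTRITON_HOME when set,
    otherwise fall back to a default (hypothetical here)."""
    override = os.environ.get("PYTRITON_HOME")
    if override:
        return Path(override).expanduser()
    # Illustrative fallback only; not PyTriton's actual default.
    return Path.home() / ".cache" / "pytriton"


os.environ["PYTRITON_HOME"] = "/tmp/pytriton-workspaces"
print(pytriton_workspace_root())  # → /tmp/pytriton-workspaces
```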
1 change: 0 additions & 1 deletion docs/known_issues.md
@@ -21,5 +21,4 @@ limitations under the License.
- Running multiple scripts hosting PyTriton on the same machine or container is not feasible.
- Deadlocks may occur in some models when employing the NCCL communication library and multiple Inference Callables are triggered concurrently. This issue can be observed when deploying multiple instances of the same model or multiple models within a single server script. Additional information about this issue can be found [here](https://docs.nvidia.com/deeplearning/nccl/user-guide/docs/usage/communicators.html#using-multiple-nccl-communicators-concurrently).
- Enabling verbose logging may cause a significant performance drop in model inference.
- Creation of a Triton object leaks a single instance of the ModelClient object. This is a known issue and will be fixed in future releases. This may cause gevent warnings to be printed to the console.
- GRPC ModelClient doesn't support timeouts for model configuration and model metadata requests due to a limitation in the underlying tritonclient library.
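Independent of the internal leak fixed by this commit, clients created in user code should be released deterministically. The sketch below uses a hypothetical `DummyClient` stand-in to show the context-manager discipline; with PyTriton the same `with` pattern would apply to `pytriton.client.ModelClient` (an assumption about that API's exact shape, not confirmed by this diff).

```python
class DummyClient:
    """Hypothetical stand-in for a network client that must be closed."""

    open_count = 0  # tracks clients that were created but not yet closed

    def __init__(self):
        DummyClient.open_count += 1

    def close(self):
        DummyClient.open_count -= 1

    def __enter__(self):
        return self

    def __exit__(self, *exc):
        # Guarantees cleanup even if the body raises, so no client leaks.
        self.close()


with DummyClient() as client:
    pass  # issue inference requests here

print(DummyClient.open_count)  # → 0
```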
