From f828f5d335c45f470f81e94cdf994073f995d583 Mon Sep 17 00:00:00 2001
From: Piotr Marcinkiewicz
Date: Wed, 15 Nov 2023 07:41:39 -0800
Subject: [PATCH] Add fix for leak of ModelClient in Triton creation to
 changelog

---
 CHANGELOG.md         | 8 ++++++++
 docs/known_issues.md | 1 -
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index c88f75b..f97868e 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -16,6 +16,14 @@ limitations under the License.
 
 # Changelog
 
+## Unreleased
+
+- Fix: Leak of ModelClient during Triton creation
+
+[//]: <> (put here on external component update with short summary what change or link to changelog)
+
+- Version of [Triton Inference Server](https://github.com/triton-inference-server/) embedded in wheel: [2.39.0](https://github.com/triton-inference-server/server/releases/tag/v2.39.0)
+
 ## 0.4.1 (2023-11-09)
 
 - New: Place where workspaces with temporary Triton model repositories and communication file sockets can be configured by `$PYTRITON_HOME` environment variable
diff --git a/docs/known_issues.md b/docs/known_issues.md
index ca03da0..8246444 100644
--- a/docs/known_issues.md
+++ b/docs/known_issues.md
@@ -21,5 +21,4 @@ limitations under the License.
 - Running multiple scripts hosting PyTriton on the same machine or container is not feasible.
 - Deadlocks may occur in some models when employing the NCCL communication library and multiple Inference Callables are triggered concurrently. This issue can be observed when deploying multiple instances of the same model or multiple models within a single server script. Additional information about this issue can be found [here](https://docs.nvidia.com/deeplearning/nccl/user-guide/docs/usage/communicators.html#using-multiple-nccl-communicators-concurrently).
 - Enabling verbose logging may cause a significant performance drop in model inference.
-- Creation of Triton object leaks single intance of ModelClient object. This is a known issue and will be fixed in future releases. This may cause gevent warnings to be printed to the console.
 - GRPC ModelClient doesn't support timeouts for model configuration and model metadata requests due to a limitation in the underlying tritonclient library.
\ No newline at end of file