Fixes a memory leak during model unloading #3868

tanmayv25 · 2022-01-29T02:40:31Z

Before

 ==21418== Thread 32:
==21418== Invalid read of size 8
==21418== at 0x4CCA20F: nvidia::inferenceserver::RateLimiter::ModelInstanceContext::RequestRemoval() (in\
/opt/tritonserver/lib/libtritonserver.so)
==21418== by 0x4CCA285: nvidia::inferenceserver::RateLimiter::ModelInstanceContext::WaitForRemoval() (in\
/opt/tritonserver/lib/libtritonserver.so)
==21418== by 0x4CD1480: nvidia::inferenceserver::RateLimiter::UnregisterModel(nvidia::inferenceserver::T\
ritonModel const*) (in /opt/tritonserver/lib/libtritonserver.so)
==21418== by 0x4DC498A: nvidia::inferenceserver::TritonModel::~TritonModel() (in /opt/tritonserver/lib/l\
ibtritonserver.so)
==21418== by 0x4DC504C: nvidia::inferenceserver::TritonModel::~TritonModel() (in /opt/tritonserver/lib/l\
ibtritonserver.so)
==21418== by 0x4C61E46: std::thread::_State_impl<std::thread::_Invoker<std::tuple<nvidia::inferenceserve\
r::(anonymous namespace)::ModelDeleter::operator()(nvidia::inferenceserver::Model*)::{lambda()#1}> > >::M\
run() (in /opt/tritonserver/lib/libtritonserver.so)
==21418== by 0x5BE3DE3: ??? (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.28)
==21418== by 0x578A608: start_thread (pthread_create.c:477)
==21418== by 0x5F7D292: clone (clone.S:95)
==21418== Address 0xf3d5d650 is 80 bytes inside a block of size 312 free'd
==21418== at 0x483F651: operator delete(void*) (vg_replace_malloc.c:923)
==21418== by 0x4DC48F0: nvidia::inferenceserver::TritonModel::~TritonModel() (in /opt/tritonserver/lib/l\
ibtritonserver.so)
==21418== by 0x4DC504C: nvidia::inferenceserver::TritonModel::~TritonModel() (in /opt/tritonserver/lib/l\
ibtritonserver.so)
==21418== by 0x4C61E46: std::thread::_State_impl<std::thread::_Invoker<std::tuple<nvidia::inferenceserve\
r::(anonymous namespace)::ModelDeleter::operator()(nvidia::inferenceserver::Model*)::{lambda()#1}> > >::M\
run() (in /opt/tritonserver/lib/libtritonserver.so)
==21418== by 0x5BE3DE3: ??? (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.28)
==21418== by 0x578A608: start_thread (pthread_create.c:477)
==21418== by 0x5F7D292: clone (clone.S:95)

deadeyegoodwin

Is this related to #3777? If so, be sure to link it to this PR.

tanmayv25 · 2022-01-31T18:53:04Z

It fixes that specific invalid read, but I still see the issue described in #3777 even with this fix.
I am still working on resolving the main issue.

GuanLuo · 2022-02-01T01:39:10Z

So the order between instance destruction and the rate limiter removing the instance is arbitrary?

tanmayv25 · 2022-02-01T03:15:34Z

I am not sure what you mean by arbitrary. Destroying the TritonModelInstance does not remove its ModelInstanceContext entry from the RateLimiter. If we removed the context before destroying the instance, the backend thread would still be waiting on the condition variable that gets destroyed with the context. This can lead to undefined behaviour.

So, it is not arbitrary. Removing the context from the rate limiter should happen after instance destruction (once the backend thread is stopped and joined), as is happening in this case. The context is a wrapper over TritonModelInstance.
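
To illustrate the ordering argument above, here is a minimal, self-contained sketch. It is not the Triton RateLimiter code: `InstanceContext`, `Instance`, and `TeardownInSafeOrder` are hypothetical names made up for this example. It only assumes what the comment describes, namely that a backend thread waits on a condition variable owned by a separate context object, so the instance (and its thread) has to be stopped and joined before that context is freed.

```cpp
#include <condition_variable>
#include <memory>
#include <mutex>
#include <thread>

// Hypothetical stand-in for the per-instance context held by a rate limiter.
// It owns the mutex and condition variable that the backend thread blocks on.
struct InstanceContext {
  std::mutex mu;
  std::condition_variable cv;
  bool exit_requested = false;
  int wakeups = 0;  // incremented once when the backend thread wakes up
};

// Hypothetical stand-in for a model instance that runs a backend thread.
class Instance {
 public:
  explicit Instance(InstanceContext* ctx)
      : ctx_(ctx), thread_([this] { Run(); }) {}

  // Stop and join the backend thread. After this returns, no thread is
  // waiting on ctx_->cv any more.
  ~Instance() {
    {
      std::lock_guard<std::mutex> lk(ctx_->mu);
      ctx_->exit_requested = true;
    }
    ctx_->cv.notify_all();
    thread_.join();
  }

 private:
  void Run() {
    std::unique_lock<std::mutex> lk(ctx_->mu);
    // Blocks on the condition variable that lives inside the context.
    ctx_->cv.wait(lk, [this] { return ctx_->exit_requested; });
    ++ctx_->wakeups;
  }

  InstanceContext* ctx_;  // not owned; must outlive this instance
  std::thread thread_;    // declared after ctx_ so ctx_ is set before Run()
};

// Tears down in the order argued for above: instance first, context second.
// Returns how many times the backend thread woke up cleanly.
int TeardownInSafeOrder() {
  auto ctx = std::make_unique<InstanceContext>();
  auto instance = std::make_unique<Instance>(ctx.get());

  instance.reset();            // 1. stop and join the backend thread
  int wakeups = ctx->wakeups;  //    safe: nothing else touches ctx now
  ctx.reset();                 // 2. only now free the condition variable
  return wakeups;
}
```

Swapping the two `reset()` calls in this sketch would free the mutex and condition variable while the backend thread may still be blocked on them, which is the kind of use-after-free that Valgrind reports as an invalid read.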

tanmayv25 requested review from deadeyegoodwin, CoderHam and GuanLuo January 29, 2022 02:40

deadeyegoodwin reviewed Jan 31, 2022

tanmayv25 requested a review from deadeyegoodwin January 31, 2022 18:53

tanmayv25 mentioned this pull request Jan 31, 2022

free() invalid pointer #3777

Closed

tanmayv25 force-pushed the tanmayv-inv-read branch from 47742c1 to 2964705 Compare January 31, 2022 23:37

tanmayv25 added 3 commits February 1, 2022 16:26

Fix invalid read in model context clean-up

0ff506a

Fix format

bd1cee3

Update the copyrights

55f2145

tanmayv25 force-pushed the tanmayv-inv-read branch from c9a37e5 to d42e93a Compare February 2, 2022 00:29

Add clarifying comment

d42e93a

GuanLuo approved these changes Feb 2, 2022

CoderHam approved these changes Feb 2, 2022

tanmayv25 merged commit 555acd9 into main Feb 3, 2022

tanmayv25 deleted the tanmayv-inv-read branch February 3, 2022 02:40

dyastremsky mentioned this pull request Aug 15, 2022

Does model should instantly free memory after unload? #4764

Closed
