How can I explicitly profile the memory used for each part of the model? #3080

dlin511 · 2025-01-23T17:44:44Z

Hi,

Does LMDeploy's TurboMind engine support any way to profile the memory usage of each part of the model?

For example, can I see how much memory the model weights occupy versus the activations versus the KV cache?

lvhan028 · 2025-02-21T04:37:21Z

Sorry, it doesn't.
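
Without built-in profiling, a rough breakdown can still be estimated by hand. The snippet below is only a back-of-envelope sketch, not an LMDeploy API: the function `estimate_memory` and the 7B-style configuration values are hypothetical, and it assumes 16-bit weights and a 16-bit KV cache. It does not measure activations; those would have to be inferred from whatever GPU memory remains unaccounted for.

```python
GIB = 1024 ** 3


def estimate_memory(num_params, num_layers, num_kv_heads, head_dim,
                    cached_tokens, bytes_per_weight=2, bytes_per_kv=2):
    """Rough estimate of weight and KV cache memory (hypothetical helper)."""
    # Weights: one value per parameter.
    weight_bytes = num_params * bytes_per_weight
    # KV cache: one key and one value vector per layer, per KV head, per token.
    kv_bytes_per_token = 2 * num_layers * num_kv_heads * head_dim * bytes_per_kv
    kv_cache_bytes = kv_bytes_per_token * cached_tokens
    return {
        "weights_gib": weight_bytes / GIB,
        "kv_bytes_per_token": kv_bytes_per_token,
        "kv_cache_gib": kv_cache_bytes / GIB,
    }


# Assumed 7B-style configuration; replace with the values from your model's config.
est = estimate_memory(
    num_params=7_000_000_000,
    num_layers=32,
    num_kv_heads=32,
    head_dim=128,
    cached_tokens=8192,
)
print(est)
```

Comparing these numbers with the total reported by `nvidia-smi` gives an approximate upper bound on what activations and other runtime buffers are using. Note that an engine may preallocate its KV cache, so the cache footprint can be larger than what the tokens actually in use would require.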
