Why does mem_alloc sometimes take a very long time, seemingly at random? #363
In my for loop, most calls to mem_alloc take less than 1 ms, but a few take longer than 200 ms. What is happening, and how can I fix or avoid it?
Replies: 1 comment
Memory allocation in CUDA can be shockingly slow at times. As a user, I am not aware of any way to fix that. However, PyCUDA does offer a way to circumvent it through memory pools:
https://documen.tician.de/pycuda/util.html#device-based-memory-pool
These work by not returning freed memory to the system. Instead, the pool retains the memory and reuses it when a request for a similarly sized allocation arrives later.
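As a rough illustration of the pool approach, here is a minimal sketch that times repeated allocations, once through `pycuda.driver.mem_alloc` and once through `pycuda.tools.DeviceMemoryPool`. The helper names `time_allocations` and `compare_on_gpu`, the 64 MiB block size, and the repeat count are assumptions made up for this example; they are not taken from the original question.

```python
import time


def time_allocations(allocate, nbytes, repeats):
    """Call allocate(nbytes) `repeats` times; return each call's duration in ms."""
    durations_ms = []
    for _ in range(repeats):
        start = time.perf_counter()
        block = allocate(nbytes)
        durations_ms.append((time.perf_counter() - start) * 1e3)
        # Drop the only reference: a direct allocation is freed,
        # a pooled allocation is handed back to its pool for reuse.
        del block
    return durations_ms


def compare_on_gpu(nbytes=64 * 1024 * 1024, repeats=100):
    """Return (worst direct ms, worst pooled ms). Requires a CUDA device."""
    # Imported here so the timing helper above stays usable without a GPU.
    import pycuda.autoinit  # noqa: F401  (creates a context on the first device)
    import pycuda.driver as cuda
    from pycuda.tools import DeviceMemoryPool

    pool = DeviceMemoryPool()
    direct = time_allocations(cuda.mem_alloc, nbytes, repeats)
    pooled = time_allocations(pool.allocate, nbytes, repeats)
    pool.free_held()  # give the retained blocks back to CUDA
    return max(direct), max(pooled)
```

On a machine with a CUDA device, `compare_on_gpu()` should show whether the occasional slow call disappears once the pool is warm; the first pooled request still goes to CUDA, so only later calls benefit. If you allocate through `pycuda.gpuarray`, passing `allocator=pool.allocate` to functions such as `gpuarray.empty` routes those allocations through the same pool.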