Hi!
Tracking the number of GFLOPs processed by each computing device is a nice feature of StarPU profiling. However, tracking memory accesses would also be very helpful for memory-bound tasks. This is entirely separate from bus profiling: I would like to check how badly my CPU and CUDA kernels are accessing memory during task execution. Each task would get an additional value: the total number of bytes read and written. The overall profiling statistics printed by StarPU would then display the achieved GFLOP/s along with the achieved GB/s of memory accesses for each device.
Actually, I meant a member of struct starpu_task that I fill myself through the 'starpu_task_insert(..., STARPU_FLOPS, nflops, ...)' utility. Adding memops (or whatever name it is given) alongside flops in perfmodel files would help track slow memory-bound operations.
> Adding memops (or whatever name it shall be given) alongside flops in perfmodel files will help tracking slowly performing memory-bound operations
Right. Actually the flops field could very well be filled from PAPI too, so adding bytes_read and bytes_written fields, handled similarly to flops, would make sense already.
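To make the request concrete, here is a minimal standalone sketch of the accounting being discussed. It does not use StarPU at all: `struct task_counters`, `struct device_stats`, and the three helper functions are hypothetical names invented for illustration, and only the idea of a per-task `flops` value comes from the existing API. The byte counters are assumed to be filled by the application or by a hardware counter library, the same way flops would be.

```c
#include <stdio.h>

/* Hypothetical per-task counters. Only the flops value corresponds to
 * something StarPU already tracks; the byte counters are the proposal. */
struct task_counters {
    double flops;         /* floating-point operations performed by the task */
    double bytes_read;    /* hypothetical: bytes read by the kernel */
    double bytes_written; /* hypothetical: bytes written by the kernel */
    double exec_time_us;  /* task execution time in microseconds */
};

/* Hypothetical per-device accumulator for the summary statistics. */
struct device_stats {
    double flops;
    double bytes;
    double exec_time_us;
};

/* Add one finished task to the totals of the device that ran it. */
static void device_stats_add(struct device_stats *d,
                             const struct task_counters *t)
{
    d->flops += t->flops;
    d->bytes += t->bytes_read + t->bytes_written;
    d->exec_time_us += t->exec_time_us;
}

/* flops per microsecond is MFLOP/s, so divide by 1e3 to get GFLOP/s. */
static double device_gflops(const struct device_stats *d)
{
    return d->exec_time_us > 0.0 ? d->flops / d->exec_time_us / 1e3 : 0.0;
}

/* bytes per microsecond is MB/s, so divide by 1e3 to get GB/s. */
static double device_gbytes_per_s(const struct device_stats *d)
{
    return d->exec_time_us > 0.0 ? d->bytes / d->exec_time_us / 1e3 : 0.0;
}

/* Print one summary line per device, in the spirit of the request. */
static void device_stats_print(const char *name, const struct device_stats *d)
{
    printf("%s: %.2f GFLOP/s, %.2f GB/s\n",
           name, device_gflops(d), device_gbytes_per_s(d));
}
```

Calling `device_stats_add()` once per completed task and `device_stats_print()` at shutdown would give the combined GFLOP/s and GB/s line per device. Rates here are computed against summed task execution time, not wall-clock time, which is an assumption of this sketch.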