exploring performance #54
Status:
To do:
Initial performance testing on a rectilinear (sinusoidal) domain on rose. The baseline is the same domain using the default GrGeomInLoopBoxes macro:
[Plots not preserved. Panel titles: "No parallelism (exploring overhead purely from tiled looping)" (1 plot); "Parallelism in tiles" (9 plots); "Parallelism over tiles" (9 plots).]
Main Observations
The idea is to choose a single tile size that maximizes speedup for both parallelism in tiles and parallelism over tiles. (Alternatively, could we choose a size that favors one case, if the gains in that case clearly outweigh the losses in the other? E.g., at size 16 both in-box and over-box parallelism achieve a 1.32x speedup, but at size 32 in-box parallelism achieves 2.0x while over-box parallelism achieves only 1.1x.)
Future work:
Soon:
Test on Ocelote?
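The tile-size trade-off raised under Main Observations can be made concrete with a small sketch. This is only an illustration, not how the experiments above were evaluated: `TileResult`, `pick_max_min`, `pick_min_time`, and the weight `w_in` are all hypothetical names, and the only numbers used are the made-up ones from the example in the text (1.32x/1.32x at size 16, 2.0x/1.1x at size 32).

```c
#include <stddef.h>

/* Hypothetical record: speedup over the baseline loop for one tile size. */
typedef struct {
    int    tile_size;
    double speedup_in;    /* parallelism in tiles   */
    double speedup_over;  /* parallelism over tiles */
} TileResult;

/* Conservative rule: pick the size whose worse speedup is largest.
 * Returns -1 if there are no results. */
int pick_max_min(const TileResult *r, size_t n)
{
    int best = -1;
    double best_score = -1.0;
    for (size_t i = 0; i < n; i++) {
        double worst = r[i].speedup_in < r[i].speedup_over
                     ? r[i].speedup_in : r[i].speedup_over;
        if (worst > best_score) {
            best_score = worst;
            best = r[i].tile_size;
        }
    }
    return best;
}

/* Weighted rule: w_in is the (assumed) fraction of baseline runtime spent
 * where in-tile parallelism applies; the rest uses over-tile parallelism.
 * Picks the size with the smallest expected runtime relative to baseline.
 * Returns -1 if there are no results. */
int pick_min_time(const TileResult *r, size_t n, double w_in)
{
    int best = -1;
    double best_time = 0.0;
    for (size_t i = 0; i < n; i++) {
        double t = w_in / r[i].speedup_in
                 + (1.0 - w_in) / r[i].speedup_over;
        if (best < 0 || t < best_time) {
            best_time = t;
            best = r[i].tile_size;
        }
    }
    return best;
}
```

With the example numbers, the conservative rule picks size 16, while an even split (`w_in = 0.5`) picks size 32, so the answer depends on how much of the runtime each case accounts for.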
Should parallelism be done over boxes or within?
This depends on the configuration of the domain.
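For reference, a minimal sketch of the two strategies on a plain 2D array. Everything here is an assumption made for illustration: it uses OpenMP pragmas (the issue does not say which threading model is used), a fixed hypothetical `TILE` size, and a simple sum in place of the real loop body. It does not use or reproduce the GrGeomInLoopBoxes macro.

```c
#include <stddef.h>

#define TILE 16  /* hypothetical tile edge length */

static int min_int(int a, int b) { return a < b ? a : b; }

/* Parallelism OVER tiles: each thread processes whole tiles. */
double sum_over_tiles(const double *a, int nx, int ny)
{
    int ntx = (nx + TILE - 1) / TILE;   /* tiles along x */
    int nty = (ny + TILE - 1) / TILE;   /* tiles along y */
    double total = 0.0;

    #pragma omp parallel for reduction(+:total)
    for (int t = 0; t < ntx * nty; t++) {
        int i0 = (t % ntx) * TILE, i1 = min_int(i0 + TILE, nx);
        int j0 = (t / ntx) * TILE, j1 = min_int(j0 + TILE, ny);
        for (int j = j0; j < j1; j++)
            for (int i = i0; i < i1; i++)
                total += a[(size_t)j * nx + i];
    }
    return total;
}

/* Parallelism IN tiles: tiles are visited one after another,
 * and the threads split the rows inside each tile. */
double sum_in_tiles(const double *a, int nx, int ny)
{
    double total = 0.0;

    for (int j0 = 0; j0 < ny; j0 += TILE) {
        for (int i0 = 0; i0 < nx; i0 += TILE) {
            int i1 = min_int(i0 + TILE, nx);
            int j1 = min_int(j0 + TILE, ny);

            #pragma omp parallel for reduction(+:total)
            for (int j = j0; j < j1; j++)
                for (int i = i0; i < i1; i++)
                    total += a[(size_t)j * nx + i];
        }
    }
    return total;
}
```

In this sketch, the over-tiles version needs enough tiles to keep every thread busy, while the in-tiles version needs tiles large enough to amortize opening a parallel region per tile. That is one way the domain configuration can favor one strategy over the other.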