Releases · sphexa-org/sphexa

Improve tree refinement for remote LET nodes so that fewer remote nodes are needed to ensure a successful gravity traversal. This improves performance because less communication is required.
injectKeys on GPUs. This tree resolution-enforcement mechanism is needed more frequently than previously thought, so it was ported to the GPU.

Fixes:

Fix compilation issues with CUDA 12.4 related to thrust::device_vector

20 Nov 17:11

sekelle

v0.81

5975c17

Fields in 32/64 bit precision

Performance enhancements:

By default: keep coordinates and temperature in double precision, all other hydrodynamics fields in single precision
Reduce temporary memory allocation in radix sort by reusing scratch buffers
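The memory effect of the default precision split described above can be illustrated with a small sketch. It is not SPH-EXA code: the field names and the `allocate_fields` and `memory_bytes` helpers are hypothetical, and Python's standard `array` module stands in for the real particle buffers.

```python
from array import array

# Hypothetical field names, chosen only for illustration.
DOUBLE_FIELDS = ("x", "y", "z", "temp")          # kept in 64-bit precision
SINGLE_FIELDS = ("vx", "vy", "vz", "rho", "p")   # stored in 32-bit precision


def allocate_fields(n, double_fields=DOUBLE_FIELDS, single_fields=SINGLE_FIELDS):
    """Allocate one zero-filled array of length n per field, with per-field precision."""
    fields = {}
    for name in double_fields:
        fields[name] = array("d", [0.0]) * n  # 'd' = C double
    for name in single_fields:
        fields[name] = array("f", [0.0]) * n  # 'f' = C float
    return fields


def memory_bytes(fields):
    """Total payload size of all field arrays in bytes."""
    return sum(a.itemsize * len(a) for a in fields.values())


fields = allocate_fields(1000)
mixed = memory_bytes(fields)
all_double = memory_bytes(allocate_fields(1000, DOUBLE_FIELDS + SINGLE_FIELDS, ()))
print(mixed, all_double)
```

With this hypothetical field set, the mixed layout needs 52 bytes per particle instead of 72. The trade-off is that single-precision fields carry only about seven significant decimal digits.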

New features:

reapplySync: repeats the domain update for additional fields after calling domain.sync()

12 Jun 12:24

sekelle

v0.8

a7d7d74

Tree-based neighbor search

Performance enhancements:

Octree-based warp-aware neighbor searches for better performance and quasi-2D geometry support
Adaptive target particle groups for gravity traversal based on bounding volumes relative to volumes of local leaf cells. Avoids large traversal stacks.
Support for multi-level tree merges for faster octree rebalancing. Avoids a rare issue where LET updates couldn't keep up with changing domain boundaries (loss of a peer rank followed by an inability to scale the octree back to the global resolution in a single step).

New features:

Support for particle splitting when initializing from a checkpoint file.
Support for initialization of rectangular domains at scale for Kelvin-Helmholtz and Wind-shock
Pure N-body gravity propagator
Coupled update of neighbor counts and smoothing lengths
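To make the idea of coupling neighbor counts and smoothing lengths concrete, here is a minimal 1D sketch. It is an assumption-laden illustration, not the scheme used in SPH-EXA: the brute-force search, the relaxation formula in `update_h`, the tolerance, and all function names are hypothetical.

```python
def count_neighbors(xs, i, h):
    """Count particles j != i with |x_j - x_i| < 2h (brute force, 1D)."""
    xi = xs[i]
    return sum(1 for j, xj in enumerate(xs) if j != i and abs(xj - xi) < 2.0 * h)


def update_h(h, n_found, n_target, dim=1):
    """Relax h halfway toward the value expected to yield n_target neighbors."""
    ratio = n_target / max(n_found, 1)  # guard against a zero neighbor count
    return h * 0.5 * (1.0 + ratio ** (1.0 / dim))


def converge_h(xs, i, h, n_target, tol=1, max_iter=50):
    """Alternate neighbor counting and h updates until the count is near n_target."""
    for _ in range(max_iter):
        n = count_neighbors(xs, i, h)
        if abs(n - n_target) <= tol:
            return h, n
        h = update_h(h, n, n_target)
    return h, count_neighbors(xs, i, h)


# Uniformly spaced particles; start with a smoothing length that is too small.
xs = [float(k) for k in range(101)]
h, n = converge_h(xs, i=50, h=0.5, n_target=10)
print(h, n)
```

The point of the coupling is that each new smoothing length is checked against a fresh neighbor count, so the two quantities stay consistent instead of drifting apart between time steps.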

Minor fixes and enhancements:

IAD tau determinants with normalization factors for better over/underflow resilience
Correct observable selection handling and settings parameter when writing and restoring from file
More robust initial domain synchronization that avoids the MPI_Send limit of 2^31 - 1 elements per message (the count argument is a 32-bit int).
Modify signalling velocity for larger time-steps
Added velocity-divergence-based minDtRho criterion to time-step control
Added acceleration-based time-step control
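The per-message element limit mentioned in the list above arises because the count argument of MPI_Send is a C int. One generic way around it is to split a large transfer into several messages. The sketch below shows only the chunking arithmetic; `split_message` is a hypothetical helper, and the actual fix in SPH-EXA may work differently.

```python
INT32_MAX = 2**31 - 1  # largest count a 32-bit signed int can express


def split_message(num_elements, max_count=INT32_MAX):
    """Return (offset, count) pairs that cover num_elements with each count <= max_count."""
    if num_elements < 0 or max_count <= 0:
        raise ValueError("num_elements must be >= 0 and max_count must be > 0")
    chunks = []
    offset = 0
    while offset < num_elements:
        count = min(max_count, num_elements - offset)
        chunks.append((offset, count))
        offset += count
    return chunks


# A buffer of 3 billion elements needs two messages under the 32-bit limit.
for offset, count in split_message(3 * 10**9):
    print(offset, count)
```

Each pair would then map to one send/receive of `count` elements starting at `offset` in the buffer.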

19 Jan 14:35

sekelle

v0.7

4a587ec

AV cleaning

Added artificial viscosity cleaning as a feature
Added interface to GRACKLE for radiative cooling
Improvements to Domain: perform octree updates and halo discovery on GPUs
Bugfix: added missing device synchronization points in domain and halo exchange when using GPU_DIRECT=ON

22 Sep 13:57

sekelle

v0.6

cb4aa8d

v0.6

Volume elements are now the default type of SPH, implemented on the GPU
Support for large-scale gravity through Ryoanji
HIP-support
GPU-direct halo exchange
Expanded test case selection
Turbulence stirring
