On AMD MI250X (64 GB of GPU memory per GCD), the MG for an 80c160 lattice comfortably fits on two nodes with four GPUs (eight GCDs) each. The 44 GB gauge configuration is therefore read by 16 MPI tasks, so every MPI task reads around 2.7 GB (2949120086 bytes). This exceeds the silly 2^31 = 2147483648 byte limit of MPI I/O, which comes from the element count being passed as a C int.
We therefore need to split reads and writes exceeding this limit into multiple I/O operations.
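A minimal sketch of such a split, assuming each operation is capped at 1 GiB so the count always fits in an `int`. The names `chunked_io`, `io_op_fn`, `count_bytes` and `MAX_CHUNK_BYTES` are hypothetical and not taken from the code base. The actual I/O is left to a callback, which in the MPI case might wrap something like `MPI_File_read_at` with `MPI_BYTE` as the datatype.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback: performs one I/O operation of `count` bytes at byte
 * offset `offset` relative to the start of this task's data. Returns 0 on
 * success. An MPI implementation might call, for example,
 *   MPI_File_read_at(fh, base + offset, buf + offset, count, MPI_BYTE, &st);
 */
typedef int (*io_op_fn)(void *ctx, uint64_t offset, int count);

/* Assumed chunk size: 1 GiB, safely below the 2^31 byte limit. */
#define MAX_CHUNK_BYTES ((uint64_t)1 << 30)

/* Split a transfer of `total_bytes` into operations of at most
 * MAX_CHUNK_BYTES each. Returns the number of operations issued,
 * or -1 if the callback reports a failure. */
static long chunked_io(uint64_t total_bytes, io_op_fn op, void *ctx)
{
    assert(op != NULL);

    uint64_t offset = 0;
    long n_ops = 0;

    while (offset < total_bytes) {
        uint64_t remaining = total_bytes - offset;
        /* The cast is safe because the value never exceeds MAX_CHUNK_BYTES. */
        int count = (int)(remaining < MAX_CHUNK_BYTES ? remaining
                                                      : MAX_CHUNK_BYTES);
        if (op(ctx, offset, count) != 0)
            return -1;

        offset += (uint64_t)count;
        n_ops++;
    }
    return n_ops;
}

/* Example callback that does no I/O and only sums the requested bytes. */
static int count_bytes(void *ctx, uint64_t offset, int count)
{
    (void)offset;
    *(uint64_t *)ctx += (uint64_t)count;
    return 0;
}
```

With the 2949120086 bytes per task from above, `chunked_io` issues three operations: two of 1 GiB and one for the remaining 801636438 bytes. The offsets are kept in `uint64_t` so they do not overflow even though each individual count fits in an `int`.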