
MPI shared memory #14

Open · @hmenke wants to merge 6 commits into unstable from shm
Conversation

@hmenke (Member) commented on Sep 26, 2023

All tests were run with multiple nodes and a different number of slots on each node:

$ cat hostfile 
pcscqm04 slots=4
pcscqm05 slots=3
pcscqm06 slots=5
pcscqm07 slots=2
$ mpirun -hostfile ./hostfile build/test/c++/mpi_window

Some ideas:

  • MPI allocator

    Similar to std::allocator, implement a shared_allocator and a distributed_shared_allocator, such that one can use e.g. std::vector<double, mpi::shared_allocator<double>> (see the sketch after this list).

    Questions:

    • Should this be in mpi or in nda?
    • For shared memory one can probably make a best guess and use split_shared() on the default communicator, but for distributed shared memory there needs to be internode communication, and that is not easily inferred from the default communicator. Maybe a global hash table with allocation information is needed?
    • For shared memory, race conditions must be prevented somehow.
    • On top of that, for distributed shared memory, accesses must be fenced and broadcast between nodes. That's not so easy to abstract away.
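
A minimal sketch of what such a shared_allocator could look like, assuming the node-local communicator comes from MPI_Comm_split_type with MPI_COMM_TYPE_SHARED (presumably what split_shared() would wrap). The class name, members, and the one-window-per-allocation scheme are illustrative, not the API proposed in this PR:

#include <mpi.h>
#include <cstddef>

namespace mpi {

  // Illustrative only: rank 0 of the node-local communicator provides the
  // whole block, the other ranks attach to it via MPI_Win_shared_query.
  // A real implementation needs proper copy semantics, allocator equality,
  // and window lifetime management.
  template <typename T>
  struct shared_allocator {
    using value_type = T;

    MPI_Comm shm_comm = MPI_COMM_NULL;
    MPI_Win win       = MPI_WIN_NULL;

    shared_allocator() {
      MPI_Comm_split_type(MPI_COMM_WORLD, MPI_COMM_TYPE_SHARED, 0,
                          MPI_INFO_NULL, &shm_comm);
    }

    // Collective over shm_comm: every rank must request the same size.
    T *allocate(std::size_t n) {
      int rank = 0;
      MPI_Comm_rank(shm_comm, &rank);
      MPI_Aint local_size = (rank == 0) ? n * sizeof(T) : 0;
      void *base = nullptr;
      MPI_Win_allocate_shared(local_size, sizeof(T), MPI_INFO_NULL,
                              shm_comm, &base, &win);
      // All ranks resolve rank 0's segment so they share one buffer.
      MPI_Aint size = 0;
      int disp_unit = 0;
      MPI_Win_shared_query(win, 0, &size, &disp_unit, &base);
      return static_cast<T *>(base);
    }

    void deallocate(T *, std::size_t) noexcept { MPI_Win_free(&win); }
  };

} // namespace mpi

With that, std::vector<double, mpi::shared_allocator<double>> v(n) would have to be constructed collectively by all ranks of a node, and even the value-initialization of the elements already illustrates the race-condition question above: every rank would write to the same buffer unless only one rank initializes and the others synchronize.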

@hmenke force-pushed the shm branch 3 times, most recently from 92e2e2b to 1738e29 on October 4, 2023 07:41
@hmenke changed the title from "WIP: MPI shared memory" to "MPI shared memory" on Oct 9, 2023
@hmenke marked this pull request as ready for review on October 9, 2023 10:00
@hmenke (Member, Author) left a comment:

Can we make this non-MPI compatible?

This adds a new abstraction for MPI_Group to be able to use the
post-start-complete-wait RMA cycle. Also adds documentation and more tests.

Co-authored-by: Mohamed Aziz Bellaaj <[email protected]>
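
For reference, a minimal sketch of the post-start-complete-wait (PSCW) cycle that this MPI_Group abstraction targets, written with raw MPI calls; the names in the PR's C++ wrapper may differ. Rank 0 exposes a window and rank 1 writes into it:

#include <mpi.h>

// Illustrative raw-MPI version of the PSCW cycle, not this PR's wrapper API.
int main(int argc, char **argv) {
  MPI_Init(&argc, &argv);
  int rank = 0;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);

  double buf = 0.0;
  MPI_Win win;
  MPI_Win_create(&buf, sizeof(buf), sizeof(buf), MPI_INFO_NULL,
                 MPI_COMM_WORLD, &win);

  MPI_Group world_group, peer = MPI_GROUP_NULL;
  MPI_Comm_group(MPI_COMM_WORLD, &world_group);

  if (rank == 0) {
    int origin[] = {1};
    MPI_Group_incl(world_group, 1, origin, &peer);
    MPI_Win_post(peer, 0, win); // exposure epoch: grant group `peer` access
    MPI_Win_wait(win);          // block until all origins called complete
  } else if (rank == 1) {
    int target[] = {0};
    MPI_Group_incl(world_group, 1, target, &peer);
    MPI_Win_start(peer, 0, win); // access epoch on the target group
    double val = 3.14;
    MPI_Put(&val, 1, MPI_DOUBLE, 0 /*target rank*/, 0 /*displacement*/,
            1, MPI_DOUBLE, win);
    MPI_Win_complete(win);       // finish epoch; the put is now visible
  }

  if (peer != MPI_GROUP_NULL) MPI_Group_free(&peer);
  MPI_Group_free(&world_group);
  MPI_Win_free(&win); // collective over MPI_COMM_WORLD
  MPI_Finalize();
}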
@Wentzell requested a review from Thoemi09 on February 25, 2025 15:15