Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Provide KokkosComm::initialize/finalize #88

Closed
wants to merge 13 commits into from

Conversation

dssgabriel
Copy link
Collaborator

@dssgabriel dssgabriel commented Jun 14, 2024

This PR implements a very basic initialization and finalization for KokkosComm (closes #82).

Users should now only call KokkosComm::initialize and KokkosComm::finalize, no more manual MPI and Kokkos initialization. Using these functions ensures MPI is correctly initialized/finalized (with multiple thread support) before Kokkos is initialized/finalized.

To-do before merging:

  • add documentation on these functions
  • make it clear to users that they should only call KokkosComm when initializing/finalizing a Kokkos + MPI application

@dssgabriel dssgabriel added the enhancement New feature or request label Jun 14, 2024
@dssgabriel dssgabriel self-assigned this Jun 14, 2024
Copy link
Member

@cedricchevalier19 cedricchevalier19 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some thoughts, and be precise in the documentation.

src/KokkosComm.hpp Outdated Show resolved Hide resolved
src/KokkosComm.hpp Outdated Show resolved Hide resolved
src/KokkosComm.hpp Outdated Show resolved Hide resolved
src/KokkosComm.hpp Outdated Show resolved Hide resolved
Copy link
Contributor

@aprokop aprokop left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be nice to have KokkosComm::ScopeGuard that is similar to Kokkos::ScopeGuard in addition to finalize and initialize. That way, a user won't have to remember to call finalize.

@dssgabriel
Copy link
Collaborator Author

I think it would be nice to have KokkosComm::ScopeGuard that is similar to Kokkos::ScopeGuard

Should this and other utilities (e.g. is_initialized) be added here or in another PR?

@aprokop
Copy link
Contributor

aprokop commented Jun 14, 2024

Should this and other utilities (e.g. is_initialized) be added here or in another PR?

Easier in a different PR to keep the PR size small.

src/KokkosComm.hpp Outdated Show resolved Hide resolved
src/KokkosComm.hpp Outdated Show resolved Hide resolved
src/KokkosComm.hpp Outdated Show resolved Hide resolved
src/KokkosComm.hpp Outdated Show resolved Hide resolved
@dssgabriel
Copy link
Collaborator Author

Last thing we need to think about is: do we abort if we failed to get the desired thread support level?

src/KokkosComm.hpp Outdated Show resolved Hide resolved
MPI_Initialized(&flag);
// Eagerly abort if MPI has already been initialized
if (0 != flag) {
MPI_Abort(MPI_COMM_WORLD, -1);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not allow MPI to be initialized already as long as the requested thread-level matches?

src/KokkosComm.hpp Outdated Show resolved Hide resolved
Comment on lines +53 to +56
// Forbid calling this function if Kokkos has already been initialized
if (Kokkos::is_initialized()) {
std::abort();
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't need this, Kokkos::initialize() will do that for you.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure I understand. We want to abort if Kokkos::initialize has already been called. Why try to initialize MPI and Kokkos if we can eagerly prove we got illegally called?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because Kokkos will do that for you, I believe. I'm not sure MPI will.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find it weird that a subproject subsumes the initialization of the main project. I should be able to initialize main Kokkos and KokkosComm from my application. The KokkosComm initializer should initialize main Kokkos if it's not already initialized and otherwise handle that gracefully. Same with MPI above.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand your words about "handle that gracefully". Kokkos initialization will abort if it has been initialized/finalized previously:
https://github.com/kokkos/kokkos/blob/892e13c8c49fa9d4ef9f8dcb69f90d525a6baa58/core/src/impl/Kokkos_Core.cpp#L1056-L1061.
The code inside KokkosComm::initialize() checking Kokkos initialization/finalization does not add anything, unless you decide to skip initializing Kokkos, which I argue is a wrong thing to do.

Copy link
Contributor

@aprokop aprokop Jun 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Kokkos::initialize does some special checks for MPI

Are you sure about this? I only see detection of the local MPI rank based on environment variables, but not calls to any MPI routines.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Kokkos docs mention one should call MPI_Init before Kokkos::initialize: https://kokkos.org/kokkos-core-wiki/ProgrammingGuide/Initialization.html?highlight=initialize#initialization

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do not prevent users from doing things manually.

You are telling people not to initialize your project if they initialize its dependencies manually. That is flawed.

However, the herein proposed API provides a straightforward and foolproof way of initializing/finalizing their KokkosComm application.

I am not arguing to take that away. If users want to shift all initialization to KokkosComm they can do so.

Moreover, I don’t understand why we would let people write fundamentally wrong code and silence it.

Why is that fundamentally wrong? You can still bail if they did it wrong but there is no reason to always bail even if the configuration they chose is correct (e.g., MPI was initialized with the same thread-level.)

Finally, I'd like to point out that this is only a first step towards KokkosComm initialization/finalization. Once #68 gets merged, we will have an isolated environment (thanks to MPI Sessions) that lets us lift most of — if not all — the constraints required for now.

Then let's not tell them one thing now and another then. We can get close to the semantics we will have then and users will not have to change their code. Keep in mind that what you are proposing now forces users to change their code such that later when KokkosComm uses Sessions the WPM will be broken and no communication outside of KokkosComm will work anymore. Or KokkosComm will be broken because users stuck to manually initializing MPI to get the WPM. We can do better than that.

This PR does not have the state, only free functions. Introducing the state has its own challenges, I think.

The approach I would use is an inline accessor to avoid explicit global variables that need to go into a compilation unit. In some header file:

inline bool& get_state_ref() {
  static bool state = false;
  return state;
}

Copy link
Collaborator Author

@dssgabriel dssgabriel Jun 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR aimed at unifying two calls — MPI_{Init,Init_thread} and Kokkos::initialize — into a single one.

You are telling people not to initialize your project if they initialize its dependencies manually. That is flawed.

It means that instead of having two calls, users may now need a third one.

Why is that fundamentally wrong? You can still bail if they did it wrong but there is no reason to always bail even if the configuration they chose is correct (e.g., MPI was initialized with the same thread-level.)

OK, that's fair.
Once MPI has been initialized, is it possible to retrieve the current thread support level? I can't find anything in OpenMPI's docs, but this is needed to check that the users' requirements are met in KokkosComm::initialize and error out if needed, as you've suggested.

Keep in mind that what you are proposing now forces users to change their code such that later when KokkosComm uses Sessions the WPM will be broken and no communication outside of KokkosComm will work anymore.

Once #68 gets merged, existing code will most likely have to change anyway. E.g. all of our tests use MPI_COMM_WORLD, which won't be valid in the SPM. Besides, we will want to provide some object (KokkosComm::Context or Handle) that wraps the MPI_Comm, Kokkos::ExecutionSpace, and maybe the associated stream if comms happen on a device. Our API will then use this object instead of a raw MPI_Comm.
All of our current interfaces are still subject to change. IMO it's ok if we tweak the semantics for now, although I agree it's best to get this PR closer to what we're aiming for in the future. 👍

This PR does not have the state, only free functions. Introducing the state has its own challenges, I think.

The approach I would use is an inline accessor to avoid explicit global variables that need to go into a compilation unit.

I don't think we need to carry any state for the moment, we're only calling MPI_Init_thread and Kokkos::initialize. Adding KokkosComm::is_{initiliazed,finalized} should be enough for now I think.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once MPI has been initialized, is it possible to retrieve the current thread support level?

The function you're looking for is MPI_QUERY_THREAD :)

It means that instead of having two calls, users may now need a third one.

I don't see the problem, esp if parts of my application rely on the WPM anyway and I have to ensure that the WPM is initialized. I cannot rely on KokkosComm to do that if I know that at some point the behavior will change.

Anyway, maybe this PR isn't ready yet and needs to be tied to #68. Let's discuss this during the Monday call, I will try to join (the MPI Forum potentially conflicts with the call).

Comment on lines +82 to +85
// Forbid calling this function if Kokkos has already been finalized or isn't yet initialized
if (Kokkos::is_finalized() || !Kokkos::is_initialized()) {
MPI_Abort(MPI_COMM_WORLD, -1);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't need it, Kokkos::finalize() will do that for you.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same but for finalization. Also, this lets us cleanly abort on all ranks.

@dssgabriel dssgabriel marked this pull request as draft June 17, 2024 16:39
@cedricchevalier19
Copy link
Member

It was decided not to continue and to provide stubs for users to initialize correctly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Provide a KokkosCommInitialize
4 participants