Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use MPI_BYTE for unknown types #81

Draft
wants to merge 1 commit into
base: develop
Choose a base branch
from

Conversation

aprokop
Copy link
Contributor

@aprokop aprokop commented Jun 11, 2024

Fix #79.

@@ -99,10 +96,8 @@ MPI_Datatype mpi_type() {
else if constexpr (std::is_same_v<T, Kokkos::complex<double>>)
return MPI_DOUBLE_COMPLEX;

else {
static_assert(std::is_void_v<T>, "mpi_type not implemented");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might make sense to assert that T is trivially copyable at least.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking ahead, I think trivially-copyable is a problem for Trilinos FAD types - they are actually trivially-copyable, but because of how they are implemented they have non-default ctors, etc that makes C++ think they're not trivially-copyable.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe if we run into this problem (a problem of success), we can make a special configure-time toggle to remove this trivially-copyable check.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing I'm not a fan of is that if you have type T with large sizeof(T), n * sizeof(T) may exceed INT_MAX, while storing them in a view is totally fine.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For now, maybe we can just send as MPI_BYTE as long as the size (MPI_Type_size) matches the extent (MPI_Type_get_extent) and the lower bound (MPI_Type_get_extent) is 0; is this the best way to check if a type is contiguous?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Kokkos could offer a trait that overrides std::is_trivially_copyable so that users can tell Kokkos that they know what they are doing (e.g., not trying to copy pointers between processes).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We discussed that also in the context of deep_copy where we are assuming that all types used in kernels are logically trivially copyable. There, we decided that this would be too big of a change to error out unless the user has specialized such a trait.

@masterleinad
Copy link
Collaborator

To actually fix #79 you would also need to make sure that you scale the number of bytes to send and receive accordingly, though.

@aprokop aprokop marked this pull request as draft June 11, 2024 19:45
@aprokop
Copy link
Contributor Author

aprokop commented Jun 11, 2024

To actually fix #79 you would also need to make sure that you scale the number of bytes to send and receive accordingly, though.

I feel sheepish for forgetting this.

@@ -99,6 +96,9 @@ MPI_Datatype mpi_type() {
else if constexpr (std::is_same_v<T, Kokkos::complex<double>>)
return MPI_DOUBLE_COMPLEX;

else if constexpr (std::is_trivially_copyable_v<T>)
return MPI_BYTE;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure that this fix alone solves your problem. I'm missing the part where we get the size of T. In https://github.com/kokkos/kokkos-comm/blob/develop/src/impl/KokkosComm_send.hpp#L72 we only get the type (MPI_BYTE here) and take the span, not the span*sizeof(T). Maybe mpi_type() should provide a count multiplier as well?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, @masterleinad identified this problem. I'm not sure what's the best way to proceed here. Your suggestion is one of the possible variants.

@cwpearson
Copy link
Collaborator

cwpearson commented Jun 18, 2024

What if the fallback case was just to construct, register, and cache a contiguous MPI derived datatype that is sizeof T in a static variable.

@aprokop
Copy link
Contributor Author

aprokop commented Jun 18, 2024

What if the fallback case was just to construct, register, and cache a contiguous MPI derived datatype that is sizeof T in a static variable.

That may work. I'm not familiar with it. Could you post a snippet of code to demonstrate what you are thinking?

@cwpearson
Copy link
Collaborator

cwpearson commented Jun 18, 2024

Untested:

else /*constexpr*/ {
  static MPI_Datatype cached = MPI_DATATYPE_NULL;
  if (MPI_DATATYPE_NULL == cached) {
    MPI_Type_contiguous(sizeof(T), MPI_BYTE, &cached);
    MPI_Type_commit(&cached);
  }
  return cached;
}

each call to mpi_type with a new T that reaches the fallback with a different T will have a different cached variable. Each new T will be created and committed once, lazily, the first time it is seen. Otherwise it will be a quick (?) runtime check.

@aprokop
Copy link
Contributor Author

aprokop commented Jun 18, 2024

each call to mpi_type with a new T that reaches the fallback with a different T will have a different cached variable

Oh, that's a cool idea! This sounds like it could work well. Maybe create a templated function to create mpi types, so that users could specialize it on their own. With the default being the two MPI calls you posted.

@cwpearson
Copy link
Collaborator

cwpearson commented Jun 18, 2024

Yeah it would be great if this could be exposed to the user to override it. I think we'd have to refactor this to be a SFINAE-style implementation rather than if constexpr then?

@aprokop
Copy link
Contributor Author

aprokop commented Jun 18, 2024

Yeah it would be great if this could be exposed to the user to override it. I think we'd have to refactor this to be a SFINAE-style implementation rather than if constexpr then?

Yeah, probably just a templated function with the default type creation and specializations.

@devreal
Copy link
Collaborator

devreal commented Jun 18, 2024

Who frees the temporary type? It should probably be put into a list of types that get destroyed at the end...

@aprokop
Copy link
Contributor Author

aprokop commented Jun 18, 2024

Can make the static variable a holder class that frees on destruction. Nevermind, it would be called too late, after MPI was finalized.

@cwpearson
Copy link
Collaborator

Are we actually required to call MPI_Type_free?

@aprokop
Copy link
Contributor Author

aprokop commented Jun 20, 2024

Are we actually required to call MPI_Type_free?

Some (like this presentation, slide 4) state that "You are not required to use it or deallocate it, but it is recommended (there may be a limit)".

I didn't see anything in the standard saying that it is required.

@devreal
Copy link
Collaborator

devreal commented Jun 21, 2024

It's not required. Datatypes are a funny thing in MPI because they can persist between Sessions (mostly for historically reasons, because they are not bound to any Session). Having said that, KokkosComm should strive to exit cleanly with all resources freed. This makes it easier to use leak detection tools and is just good practice.

@cwpearson
Copy link
Collaborator

This ties in to the init / finalize discussions I think. We have to have the opportunity to do something before / when MPI_Finalize gets called.

@devreal
Copy link
Collaborator

devreal commented Jun 24, 2024 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Convert user types to MPI_BYTE
4 participants