Merge #6274
6274: Improve the MPI parcelport and change the zero-copy threshold to 8192 r=hkaiser a=JiakunYan

Improve the MPI parcelport:
- Replace the lock-based tag provider with an atomic variable
- Make the header message size dynamic
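
The first bullet can be illustrated with a minimal sketch of a lock-free tag provider. This is not the HPX implementation: the class name `atomic_tag_provider`, its `next()` method, and the wrap-around policy are assumptions made for illustration. Only the idea of an atomic counter bounded by the largest valid MPI tag (`MPI_MAX_TAG` in this commit) is taken from the change.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical sketch: hands out message tags without taking a lock.
// The name and interface are illustrative, not HPX's actual API.
class atomic_tag_provider
{
public:
    // max_tag is the largest tag the MPI library accepts (must be >= 0),
    // e.g. the value read from the MPI_TAG_UB attribute.
    explicit atomic_tag_provider(int max_tag) noexcept
      : modulus_(static_cast<std::uint64_t>(max_tag) + 1)
    {
    }

    // Returns the next tag in [0, max_tag], wrapping around after max_tag.
    // A single fetch_add replaces the lock/unlock pair of a mutex-based
    // provider. Relaxed ordering is enough because only the counter value
    // itself is shared; no other data is published through it.
    int next() noexcept
    {
        std::uint64_t const ticket =
            counter_.fetch_add(1, std::memory_order_relaxed);
        return static_cast<int>(ticket % modulus_);
    }

private:
    std::uint64_t modulus_;
    std::atomic<std::uint64_t> counter_{0};
};
```

Because tags wrap around, a tag can be handed out again while an older message with the same tag is still in flight. This sketch does not track that; it only shows how the lock is avoided.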

Since the maximum size of the dynamic header message is set to the zero-copy serialization threshold, I also raised that threshold from 128B to 8192B. (The message size at which MPI switches from the eager protocol to the rendezvous protocol is usually around 8K to 64K.)
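
One way to picture a dynamically sized header capped at the zero-copy threshold is sketched below. The assumption that a small payload is appended to the header message when both fit under the cap is mine; the commit message only states that the header size is dynamic and that its maximum equals the threshold. The names `header_plan` and `plan_header` are hypothetical.

```cpp
#include <cstddef>

// Hypothetical sketch, not the HPX parcelport code.
struct header_plan
{
    std::size_t header_bytes;    // size of the header message to send
    bool payload_inlined;        // true if the payload travels in the header
};

// fixed_bytes:   size of the mandatory header fields
// payload_bytes: size of the serialized data that could ride along
// max_header:    upper bound for the header message; this PR sets it to the
//                zero-copy serialization threshold (8192 by default)
inline header_plan plan_header(std::size_t fixed_bytes,
    std::size_t payload_bytes, std::size_t max_header = 8192) noexcept
{
    // Assumed policy: inline the payload only if header plus payload
    // stays within the cap; otherwise send the fixed fields alone.
    if (fixed_bytes <= max_header &&
        payload_bytes <= max_header - fixed_bytes)
    {
        return {fixed_bytes + payload_bytes, true};
    }
    return {fixed_bytes, false};
}
```

Under this assumed policy, a 64-byte header with a 100-byte payload becomes a single 164-byte message, while a 9000-byte payload leaves the header at 64 bytes and is sent separately.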

Experiments: Octo-Tiger, max_level=6, SDSC Expanse, 32 nodes, 128 threads per node.
With `hpx.parcel.mpi.sendimm=0`, this PR improves the total execution time from 11.77s to 10.68s.
With `hpx.parcel.mpi.sendimm=1`, this PR improves the total execution time from ~60s to 11.72s.

Co-authored-by: Jiakun Yan <[email protected]>
StellarBot and JiakunYan committed Jun 28, 2023
2 parents 9821469 + 3cc519f commit 8a949a7
Showing 14 changed files with 277 additions and 271 deletions.
4 changes: 2 additions & 2 deletions CMakeLists.txt
@@ -655,8 +655,8 @@ endif()
 hpx_option(
 HPX_WITH_ZERO_COPY_SERIALIZATION_THRESHOLD
 STRING
-"The threshold in bytes to when perform zero copy optimizations (default: 128)"
-"128"
+"The threshold in bytes to when perform zero copy optimizations (default: 8192)"
+"8192"
 ADVANCED
 )
 hpx_add_config_define(

vivekd01 commented on Jul 1, 2023 (minimized):

 hpx_option(
 HPX_WITH_ZERO_COPY_SERIALIZATION_THRESHOLD
 STRING
-"The threshold in bytes to when perform zero copy optimizations (default: 128)"
-"128"
+"The threshold in bytes to when perform zero copy optimizations (default: 8192)"
+"8192"
 ADVANCED
 )

After making this change, rebuild your project using CMake, and the HPX_WITH_ZERO_COPY_SERIALIZATION_THRESHOLD will be set to "8192" as the default value.

6 changes: 0 additions & 6 deletions cmake/toolchains/Cray.cmake
@@ -90,12 +90,6 @@ set(HPX_PARCELPORT_LIBFABRIC_WITH_LOGGING
 OFF
 CACHE BOOL "Libfabric parcelport logging on/off flag"
 )
-set(HPX_WITH_ZERO_COPY_SERIALIZATION_THRESHOLD
-"4096"
-CACHE
-STRING
-"The threshold in bytes to when perform zero copy optimizations (default: 128)"
-)

 # We do a cross compilation here ...
 set(CMAKE_CROSSCOMPILING

vivekd01 commented on Jul 1, 2023 (minimized):

set(HPX_WITH_ZERO_COPY_SERIALIZATION_THRESHOLD
"4096"
CACHE
STRING
"The threshold in bytes to when perform zero copy optimizations (default: 128)"
)

The HPX_WITH_ZERO_COPY_SERIALIZATION_THRESHOLD variable will be set to "4096" as the default value. The CACHE option ensures that the value is stored in the CMake cache, so it persists across multiple CMake runs.

6 changes: 0 additions & 6 deletions cmake/toolchains/CrayKNL.cmake
@@ -88,12 +88,6 @@ set(HPX_PARCELPORT_LIBFABRIC_WITH_LOGGING
 OFF
 CACHE BOOL "Libfabric parcelport logging on/off flag"
 )
-set(HPX_WITH_ZERO_COPY_SERIALIZATION_THRESHOLD
-"4096"
-CACHE
-STRING
-"The threshold in bytes to when perform zero copy optimizations (default: 128)"
-)

 # Set the TBBMALLOC_PLATFORM correctly so that find_package(TBBMalloc) sets the
 # right hints

6 changes: 0 additions & 6 deletions cmake/toolchains/CrayKNLStatic.cmake
@@ -72,12 +72,6 @@ set(HPX_PARCELPORT_LIBFABRIC_WITH_LOGGING
 OFF
 CACHE BOOL "Libfabric parcelport logging on/off flag"
 )
-set(HPX_WITH_ZERO_COPY_SERIALIZATION_THRESHOLD
-"4096"
-CACHE
-STRING
-"The threshold in bytes to when perform zero copy optimizations (default: 128)"
-)

 # Set the TBBMALLOC_PLATFORM correctly so that find_package(TBBMalloc) sets the
 # right hints

6 changes: 0 additions & 6 deletions cmake/toolchains/CrayStatic.cmake
@@ -83,9 +83,3 @@ set(HPX_PARCELPORT_LIBFABRIC_WITH_LOGGING
 OFF
 CACHE BOOL "Libfabric parcelport logging on/off flag"
 )
-set(HPX_WITH_ZERO_COPY_SERIALIZATION_THRESHOLD
-"4096"
-CACHE
-STRING
-"The threshold in bytes to when perform zero copy optimizations (default: 128)"
-)

2 changes: 2 additions & 0 deletions libs/core/mpi_base/include/hpx/mpi_base/mpi_environment.hpp
@@ -86,6 +86,8 @@ namespace hpx::util {

 using mutex_type = hpx::spinlock;

+static int MPI_MAX_TAG;
+
 private:
 static mutex_type mtx_;

7 changes: 7 additions & 0 deletions libs/core/mpi_base/src/mpi_environment.cpp
@@ -23,6 +23,8 @@
 ///////////////////////////////////////////////////////////////////////////////
 namespace hpx::util {

+int mpi_environment::MPI_MAX_TAG = 32767;
+
 namespace detail {

 bool detect_mpi_environment(
@@ -267,6 +269,11 @@ namespace hpx::util {

 rtcfg.add_entry("hpx.parcel.mpi.rank", std::to_string(this_rank));
 rtcfg.add_entry("hpx.parcel.mpi.processorname", get_processor_name());
+void* max_tag_p;
+int flag;
+MPI_Comm_get_attr(MPI_COMM_WORLD, MPI_TAG_UB, &max_tag_p, &flag);
+if (flag)
+MPI_MAX_TAG = *(int*) max_tag_p;
 }

 std::string mpi_environment::get_processor_name()

@@ -108,14 +108,6 @@ namespace hpx::parcelset::policies::lci {
 data_ = header_buffer;
 }

-void reset() noexcept
-{
-if (data_ != nullptr)
-{
-free(data_);
-}
-}
-
 bool valid() const noexcept
 {
 return data_ != nullptr && signature() == MAGIC_SIGNATURE;