UCX: fix problem when doing multiple session init/finalize
This is another case where the MCA system gets shut down once no more "instances" are present.
Unless --enable-mca-dso is used, the UCX common code retains info about memory
allocation, etc. that is no longer correct after the shutdown.

related to open-mpi#12909

Signed-off-by: Howard Pritchard <[email protected]>
hppritcha committed Nov 5, 2024
1 parent 25feb3b commit db92ff5
Showing 1 changed file with 10 additions and 1 deletion.
11 changes: 10 additions & 1 deletion opal/mca/common/ucx/common_ucx.c
@@ -4,7 +4,7 @@
* Copyright (c) 2019 Intel, Inc. All rights reserved.
* Copyright (c) 2019 Research Organization for Information Science
* and Technology (RIST). All rights reserved.
- * Copyright (c) 2021 Triad National Security, LLC. All rights
+ * Copyright (c) 2021-2024 Triad National Security, LLC. All rights
* reserved.
* Copyright (c) 2022 Google, LLC. All rights reserved.
* Copyright (c) 2022 IBM Corporation. All rights reserved.
@@ -107,6 +107,10 @@ OPAL_DECLSPEC void opal_common_ucx_mca_var_register(const mca_base_component_t *
        // cleans up the MCA vars. This will cause the string to go
        // out of scope unless we place the pointer to it on the heap.
        opal_common_ucx.tls = (char **) malloc(sizeof(char *));
        *opal_common_ucx.tls = NULL;
    }

    if (NULL == *opal_common_ucx.tls) {
        *opal_common_ucx.tls = strdup(default_tls);
    }

@@ -122,8 +126,13 @@ OPAL_DECLSPEC void opal_common_ucx_mca_var_register(const mca_base_component_t *

    if (NULL == opal_common_ucx.devices) {
        opal_common_ucx.devices = (char**) malloc(sizeof(char*));
        *opal_common_ucx.devices = NULL;
    }

    if (NULL == *opal_common_ucx.devices) {
        *opal_common_ucx.devices = strdup(default_devices);
    }

    devices_index = mca_base_var_register(
        "opal", "opal_common", "ucx", "devices",
        "List of device driver pattern names, which, if supported by UCX, will "
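The pattern in the hunks above can be sketched in isolation. The following is a minimal, standalone illustration, not the Open MPI code itself: `demo_common`, `demo_default_tls`, `demo_var_register`, `demo_var_cleanup`, and `demo_two_sessions` are hypothetical names, the default string is a placeholder, and the cleanup function only models the assumption that MCA variable teardown frees the string and clears the slot while the heap-allocated slot itself survives in a non-DSO build.

```c
#define _POSIX_C_SOURCE 200809L /* for strdup() under strict ISO C modes */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the opal_common_ucx global; only "tls" is modeled. */
static struct {
    char **tls; /* heap-allocated slot that holds a heap-allocated string */
} demo_common = { NULL };

/* Placeholder default value, not the real UCX transport list. */
static const char *demo_default_tls = "demo_tls";

/* Mirrors the patched logic: allocate the slot once and mark it empty,
 * then (re)fill it whenever it is found empty. */
static int demo_var_register(void)
{
    if (NULL == demo_common.tls) {
        demo_common.tls = (char **) malloc(sizeof(char *));
        if (NULL == demo_common.tls) {
            return -1;
        }
        *demo_common.tls = NULL;
    }

    if (NULL == *demo_common.tls) {
        *demo_common.tls = strdup(demo_default_tls);
        if (NULL == *demo_common.tls) {
            return -1;
        }
    }
    return 0;
}

/* ASSUMPTION: models MCA variable cleanup as "free the string, clear the
 * slot". The slot pointer stays valid because the code is never unloaded. */
static void demo_var_cleanup(void)
{
    if (NULL != demo_common.tls && NULL != *demo_common.tls) {
        free(*demo_common.tls);
        *demo_common.tls = NULL;
    }
}

/* Runs two init/finalize cycles; returns 0 if both see a valid default. */
static int demo_two_sessions(void)
{
    for (int i = 0; i < 2; ++i) {
        if (0 != demo_var_register()) {
            return -1;
        }
        if (0 != strcmp(*demo_common.tls, demo_default_tls)) {
            return -1;
        }
        demo_var_cleanup();
    }
    return 0;
}
```

Calling `demo_two_sessions()` from a `main` exercises the second registration, which is the case the commit addresses: the slot is already allocated, so only the separate `NULL == *slot` check restores the default string. If the `strdup` stayed inside the first `if`, the second cycle in this sketch would dereference an empty slot.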
