-
Notifications
You must be signed in to change notification settings - Fork 861
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update collective framework count/disp arrays for bigcount #12621
Conversation
9aa2745
to
3e9c622
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This touches more code than I thought it would but I like the mechanism. Just some nitpicks.
ompi/util/count_disp_array.h
Outdated
#define OMPI_COUNT_ARRAY_INIT(array, data) \ | ||
do { \ | ||
if (sizeof(*data) == sizeof(int)) { \ | ||
ompi_count_array_init(array, (const int *) data); \ | ||
} else if (sizeof(*data) == sizeof(size_t)) { \ | ||
ompi_count_array_init_c(array, (const size_t *) data); \ | ||
} \ | ||
} while (0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Parens around macro parameters:
#define OMPI_COUNT_ARRAY_INIT(array, data) \ | |
do { \ | |
if (sizeof(*data) == sizeof(int)) { \ | |
ompi_count_array_init(array, (const int *) data); \ | |
} else if (sizeof(*data) == sizeof(size_t)) { \ | |
ompi_count_array_init_c(array, (const size_t *) data); \ | |
} \ | |
} while (0) | |
#define OMPI_COUNT_ARRAY_INIT(array, data) \ | |
do { \ | |
if (sizeof(*(data)) == sizeof(int)) { \ | |
ompi_count_array_init((array), (const int *) (data)); \ | |
} else if (sizeof(*(data)) == sizeof(size_t)) { \ | |
ompi_count_array_init_c((array), (const size_t *) (data)); \ | |
} \ | |
} while (0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here is a full review. Most points are minor (I hope).
@@ -121,7 +121,8 @@ int mca_coll_hcoll_allgatherv(const void *sbuf, int scount, | |||
mca_coll_hcoll_module_t *hcoll_module = (mca_coll_hcoll_module_t*)module; | |||
stype = ompi_dtype_2_hcoll_dtype(sdtype, NO_DERIVED); | |||
rtype = ompi_dtype_2_hcoll_dtype(rdtype, NO_DERIVED); | |||
if (OPAL_UNLIKELY(HCOL_DTE_IS_ZERO(stype) || HCOL_DTE_IS_ZERO(rtype))) { | |||
if (OPAL_UNLIKELY(HCOL_DTE_IS_ZERO(stype) || HCOL_DTE_IS_ZERO(rtype) | |||
|| ompi_count_array_is_64bit(rcount))) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is probably something that someone from @janjust may want to look at: should hcoll disqualify itself if the count argument is 64bit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@janjust Ping
/* TODO:BIGCOUNT: Remove tehese temporaries once ompi_datatype is updated for bigcount */ | ||
int *tmp_rcounts = malloc(sizeof(int) * size); | ||
int *tmp_disps = malloc(sizeof(int) * size); | ||
if (NULL == tmp_rcounts || NULL == tmp_disps) { | ||
err = OMPI_ERR_OUT_OF_RESOURCE; | ||
goto exit; | ||
} | ||
for (i = 0; i < size; ++i) { | ||
tmp_rcounts[i] = (int) ompi_count_array_get(rcounts, i); | ||
tmp_disps[i] = (int) ompi_disp_array_get(disps, i); | ||
} | ||
ompi_datatype_create_indexed(size,tmp_rcounts,tmp_disps,rdtype,&ndtype); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a PR for the datatype interface? I wonder whether we can reuse the array structure there?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a draft PR: #12351. It seems like these structs could be used there as well.
a551039
to
2565796
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -121,7 +121,8 @@ int mca_coll_hcoll_allgatherv(const void *sbuf, int scount, | |||
mca_coll_hcoll_module_t *hcoll_module = (mca_coll_hcoll_module_t*)module; | |||
stype = ompi_dtype_2_hcoll_dtype(sdtype, NO_DERIVED); | |||
rtype = ompi_dtype_2_hcoll_dtype(rdtype, NO_DERIVED); | |||
if (OPAL_UNLIKELY(HCOL_DTE_IS_ZERO(stype) || HCOL_DTE_IS_ZERO(rtype))) { | |||
if (OPAL_UNLIKELY(HCOL_DTE_IS_ZERO(stype) || HCOL_DTE_IS_ZERO(rtype) | |||
|| ompi_count_array_is_64bit(rcount))) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@janjust Ping
I believe hcoll is being replaced by UCC so i don't think there's going to be an investement of time in embiggening hcoll unless it already is. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks legit. I have 2 comments:
- I don't really like the new ompi_count_array_t and ompi_disp_array_t structs because they misalign everything due to their sizes (enum, aka int, + pointer). Taking in account that we only deal with 32 or 64 bits arrays, we could taken advantage of their implicit alignment (to 4 or 8 bytes), and repurpose the last bit as a 32/64 indicator.
typedef struct ompi_count_array_t {
union {
const int *int_array;
const size_t *size_t_array;
} data;
} ompi_count_array_t;
static inline void ompi_count_array_init(ompi_count_array_t *array, const int *data)
{
array->data.int_array = data | 0x1;
}
static inline void ompi_count_array_init_c(ompi_count_array_t *array, const size_t *data)
{
array->data.size_t_array = data;
}
static inline bool ompi_count_array_is_64bit(ompi_count_array_t *array)
{
return !(array->data.size_t_array & 0x1);
}
static inline const void *ompi_count_array_ptr(ompi_count_array_t *array)
{
return array->data.size_t_array;
}
/* Get a count in the array at index i */
static inline size_t ompi_count_array_get(ompi_count_array_t *array, size_t i)
{
if (OPAL_LIKELY(array->data.int_t_array & 0x1)) {
return array->data.int_array[i];
}
return array->data.size_t_array[i];
}
- Why do you call the union field
data
but the corresponding accessor is calledompi_count_array_ptr
(basicallyptr
) ?
well the suggested code doesn't even compile but I'll see about using the idea
|
ompi/util/count_disp_array.h
Outdated
static inline const void *ompi_disp_array_ptr(ompi_disp_array_t *array) | ||
{ | ||
if (OPAL_LIKELY((uint64_t)array->data.int_array & 0x1L)){ | ||
return (const void *)((uint64_t)array->data.int_array & ~0x1L); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a small suggestion: I'd prefer intptr_t
instead of uint64_t
. long
(as in ~0x1L
) is not guaranteed to be 64bit on all machines but the promotion to (signed) intptr_t
in two-compliment will guarantee that the sign is propagated and we end up with a correct 64bit mask. I'm not sure this will work with unsigned 64bit. Also, intptr_t
is guaranteed to be large enough to hold a pointer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks better. One thing I'm concerned about is the use of pointers to ompi_count_array_t and ompi_disp_array_t.
First, because it suggests the possible use of a pointer to an object located outside the scope of the collective. This is not true for persistent or nonblocking collectives, as this object must be available for the entire duration of the operations. I know in the current code this is not really a pointer to an object, but the user pointer disguised, but other casual readers might be unaware of this.
Moreover, if we change the prototype to the internal functions to use ompi_count_array_t and ompi_disp_array_t, in addition to a cleaner code, it will also prevent the use of these values directly (without going through the accessors that returns the pointers to the arrays).
This updates the coll framework functions using count/displacement arrays to support bigcount. Instead of directly using pointers to bigcount/non-bigcount type arrays, this adds special descriptor types, ompi_count_array_t and ompi_disp_array_t, that must be used through inline accessor functions to ensure use of the correct type, whether bigcount or non-bigcount. Internally, these descriptors are typedefs of intptr_t and hold both a pointer and flag indicating the pointer type. Co-authored-by: Howard Pritchard <[email protected]> Signed-off-by: Jake Tronge <[email protected]>
3c1ba58
to
3312228
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Thanks for the work 👍
This updates the collective framework functions using count/displacement arrays to support bigcount. Instead of directly using pointers to bigcount and non-bigcount arrays, this adds special descriptor types, ompi_count_array and ompi_disp_array, that internally hold a union of either type. Collective components can now access count and displacement values through inline get functions on the descriptors, allowing use of both bigcount/non-bigcount arrays, depending on how the descriptors were initialized.