In file ompi/oshmem/mca/spml/ucx/spml_ucx.c
around line 1140,
    /* Check if we have an idle context to reuse */
    SHMEM_MUTEX_LOCK(mca_spml_ucx.internal_mutex);
    for (i = 0; i < idle_array->ctxs_count; i++) {
        if (idle_array->ctxs[i]->options & options) {
            ucx_ctx = idle_array->ctxs[i];
            _ctx_remove(idle_array, ucx_ctx, i);
            break;
        }
    }
The code above tries to reuse a context from the array of idle contexts. However, the condition (idle_array->ctxs[i]->options & options) around line 1140 means that an idle context whose options value is 0 will never be reused, because a bitwise AND with 0 is always 0.
If you modify the OSHMEM code to print ctxs_count, which is the length of the array idle_array, and run the following simple program,
    #include <shmem.h>

    int main() {
        shmem_init();
        for (int i = 0; i < 10; i++) {
            shmem_ctx_t ctx;
            shmem_ctx_create(0, &ctx);
            shmem_ctx_destroy(ctx);
        }
        shmem_finalize();
    }
you will observe that ctxs_count grows from 0 to 9, which indicates that the idle contexts are not being reused and the idle array keeps growing. I believe this is not the expected behavior, and the fix would be as simple as changing the condition from idle_array->ctxs[i]->options & options to idle_array->ctxs[i]->options == options.
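The growth described above can be modelled without an OpenSHMEM installation. The following is only a sketch: idle_pool_t, pool_take, pool_put, idle_after_cycles and POOL_CAP are hypothetical names invented for this illustration, and the pool stores bare option bitmasks rather than real UCX contexts. It compares the two matching predicates over repeated create/destroy cycles.

```c
#include <stdbool.h>
#include <stddef.h>

#define POOL_CAP 64 /* hypothetical capacity, large enough for this model */

/* Hypothetical stand-in for the idle array: it stores only option bitmasks. */
typedef struct {
    long   options[POOL_CAP];
    size_t count;
} idle_pool_t;

typedef bool (*match_fn)(long idle_options, long requested);

/* Predicate used by the current code: any shared bit counts as a match. */
static bool match_and(long idle_options, long requested)
{
    return (idle_options & requested) != 0;
}

/* Predicate proposed in this issue: the option sets must be identical. */
static bool match_eq(long idle_options, long requested)
{
    return idle_options == requested;
}

/* Remove the first matching idle entry; returns true if one was reused. */
static bool pool_take(idle_pool_t *pool, long requested, match_fn match)
{
    for (size_t i = 0; i < pool->count; i++) {
        if (match(pool->options[i], requested)) {
            pool->options[i] = pool->options[pool->count - 1];
            pool->count--;
            return true;
        }
    }
    return false;
}

/* Return a context's options to the idle pool (models a destroy). */
static void pool_put(idle_pool_t *pool, long options)
{
    if (pool->count < POOL_CAP) {
        pool->options[pool->count++] = options;
    }
}

/* Run `cycles` create/destroy pairs and report the final idle count. */
static size_t idle_after_cycles(long options, int cycles, match_fn match)
{
    idle_pool_t pool = { .count = 0 };
    for (int c = 0; c < cycles; c++) {
        (void)pool_take(&pool, options, match); /* create: try to reuse */
        pool_put(&pool, options);               /* destroy: back to idle */
    }
    return pool.count;
}
```

In this model, idle_after_cycles(0, 10, match_and) leaves 10 idle entries because no entry is ever reused, while idle_after_cycles(0, 10, match_eq) leaves a single entry that is recycled on every iteration.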
Moreover, the current code can also lead to a correctness issue, because it can hand out a context created with a more restrictive set of options in response to a request with less restrictive options. Consider the following program.
With the current code, a context configured with both SHMEM_CTX_NOSTORE and SHMEM_CTX_PRIVATE will be reused for a request that specifies only SHMEM_CTX_NOSTORE. This is a correctness issue because ctx2 may be shared. In fact, the current code crashes when the above program runs. Changing & to == fixes this issue as well.
It is clearly stated in section 9.4.1 of the OpenSHMEM 1.4 specification that multiple options can be combined with a bitwise OR operation.
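Because options are bitmasks combined with OR, the mismatch described above comes down to how two masks are compared. The sketch below uses hypothetical bit values (OPT_SERIALIZED, OPT_PRIVATE, OPT_NOSTORE); the real SHMEM_CTX_* constants are defined by the implementation and may differ. The helper names reusable_and and reusable_eq are also made up for illustration.

```c
#include <stdbool.h>

/* Hypothetical bit values standing in for the SHMEM_CTX_* option constants. */
enum {
    OPT_SERIALIZED = 1 << 0,
    OPT_PRIVATE    = 1 << 1,
    OPT_NOSTORE    = 1 << 2
};

/* Current behavior: an idle context matches if it shares any option bit. */
static bool reusable_and(long idle_options, long requested)
{
    return (idle_options & requested) != 0;
}

/* Proposed behavior: an idle context matches only on identical options. */
static bool reusable_eq(long idle_options, long requested)
{
    return idle_options == requested;
}
```

With these definitions, reusable_and(OPT_NOSTORE | OPT_PRIVATE, OPT_NOSTORE) is true, so a private, no-store context would be handed to a caller that asked only for no-store, whereas reusable_eq rejects that pairing and still accepts an exact match, including the case where both masks are 0.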