I have recently encountered a project that hard-codes the values passed through MPIEXEC_NUMPROC_FLAG and OMP_NUM_THREADS so that they add up to $(nproc). But on the packaging side, -j 2 is automatically added, thus oversubscribing the CPU cores. Is there an efficient way to define this execution dynamically? Maybe with generator expressions, or some internal CTest logic?
if (USE_MPI)
  if (TEST_MPI_RANKS STREQUAL "auto")
    include(ProcessorCount)
    ProcessorCount(nproc)
    # take 1/${TEST_OMP_THREADS} of the number of procs (rounded up)
    math(EXPR num_ranks "(${nproc}+${TEST_OMP_THREADS}-1)/${TEST_OMP_THREADS}")
  else ()
    set(num_ranks ${TEST_MPI_RANKS})
  endif ()
  message("Tests will run with ${num_ranks} MPI ranks and ${TEST_OMP_THREADS} OpenMP threads each")
endif ()
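The snippet above computes the rank count, but nothing tells CTest how many cores each test actually uses, which is where the -j 2 oversubscription comes from. A minimal sketch of wiring num_ranks into a test via the PROCESSORS test property, assuming the standard FindMPI variables and a placeholder target my_mpi_test:

# total core footprint of one test: ranks x threads
math(EXPR cores_per_test "${num_ranks} * ${TEST_OMP_THREADS}")
add_test(NAME my_mpi_test
         COMMAND ${MPIEXEC_EXECUTABLE} ${MPIEXEC_NUMPROC_FLAG} ${num_ranks}
                 $<TARGET_FILE:my_mpi_test>)
set_tests_properties(my_mpi_test PROPERTIES
  PROCESSORS ${cores_per_test}
  ENVIRONMENT "OMP_NUM_THREADS=${TEST_OMP_THREADS}")

With PROCESSORS declared, a parallel ctest run only co-schedules tests whose combined processor counts fit within the requested parallel level.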
I do have several projects that run MPI tests where I have to be careful about the number of MPI workers, because some of the tests require a minimum number of workers or they’ll fail.
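For what it's worth, one way to handle such a minimum is to only register the test when enough ranks are available; a sketch, where the test name and the minimum of 4 are made up:

# 4 is a made-up minimum rank count for this hypothetical test
if (num_ranks GREATER_EQUAL 4)
  add_test(NAME needs_four_ranks
           COMMAND ${MPIEXEC_EXECUTABLE} ${MPIEXEC_NUMPROC_FLAG} ${num_ranks}
                   $<TARGET_FILE:needs_four_ranks>)
else ()
  message(STATUS "Skipping needs_four_ranks: needs >= 4 ranks, have ${num_ranks}")
endif ()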
I use the RESOURCE_LOCK test property on each of the MPI tests. That way I know each MPI test will run by itself. You could also use the RUN_SERIAL property.
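For reference, that looks something like this (mpi is just an arbitrary lock name, and the test names are placeholders):

# tests sharing the "mpi" lock never run at the same time as each other
set_tests_properties(mpi_test_a mpi_test_b PROPERTIES RESOURCE_LOCK mpi)
# RUN_SERIAL is stronger: the test runs alongside no other test at all
set_tests_properties(mpi_test_a PROPERTIES RUN_SERIAL TRUE)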
One confusing thing is that in a lot of places ctest_test is mentioned, which is part of CTest's dashboard scripting (the CDash side). It is unclear which properties can be used directly and which cannot. @scivision, have you used CDash, and is it worth it? Is it worth implementing even if the results are not uploaded?
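For what it's worth, ctest_test() is a command for CTest's script mode (ctest -S), which works fine without CDash as long as ctest_submit() is never called. A minimal sketch with placeholder paths:

# dashboard.cmake -- run with: ctest -S dashboard.cmake
set(CTEST_SOURCE_DIRECTORY "/path/to/source")  # placeholder
set(CTEST_BINARY_DIRECTORY "/path/to/build")   # placeholder
set(CTEST_CMAKE_GENERATOR "Ninja")
ctest_start(Experimental)
ctest_configure()
ctest_build()
ctest_test(PARALLEL_LEVEL 4)
# ctest_submit()  # only if results should go to CDash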
What I would ultimately like to do is run multiple (themselves parallel) tests concurrently, since I have noticed that the tests do not scale very well, and there are various short tests that could finish alongside each other. For that, a mechanism to dynamically define the number of MPI processes is necessary. I know that part of the mechanism is possible, since tests can be defined dynamically (e.g. Catch2's test discovery), but I haven't investigated how to adapt that interface for MPI+OpenMP tests.
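A sketch of how that splitting might look, assuming the FindMPI variables and a made-up policy of giving each short test a quarter of the machine (test names are placeholders):

include(ProcessorCount)
ProcessorCount(nproc)
# made-up policy: each short test gets a quarter of the cores, minimum 1 rank
math(EXPR small_ranks "${nproc} / (4 * ${TEST_OMP_THREADS})")
if (small_ranks LESS 1)
  set(small_ranks 1)
endif ()
foreach (t IN ITEMS short_test_1 short_test_2 short_test_3)
  add_test(NAME ${t}
           COMMAND ${MPIEXEC_EXECUTABLE} ${MPIEXEC_NUMPROC_FLAG} ${small_ranks}
                   $<TARGET_FILE:${t}>)
  math(EXPR cores "${small_ranks} * ${TEST_OMP_THREADS}")
  set_tests_properties(${t} PROPERTIES
    PROCESSORS ${cores}
    ENVIRONMENT "OMP_NUM_THREADS=${TEST_OMP_THREADS}")
endforeach ()

With PROCESSORS declared on each test, ctest run with a high parallel level can pack several of these short tests onto the machine at once without oversubscribing it.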