Regression: CMake 3.22+ fails to find MPI on Cray systems (reproducible on Frontier supercomputer)

Hi,

We have encountered a regression in CMake starting with version 3.22 that affects MPI detection on Cray systems, specifically on Frontier (an HPE Cray EX supercomputer at OLCF). The issue is not present in earlier versions such as 3.20 or 3.21. Details and steps to reproduce the problem follow.

[Summary of the issue]
CMake Versions:
Not reproducible: CMake 3.20, 3.21
Reproducible: CMake 3.22, 3.27, 3.31 (latest)

Scenario 1:
CMAKE_SYSTEM_NAME=Catamount
Modules: PrgEnv-cray or PrgEnv-gnu
Not reproducible if CMAKE_SYSTEM_NAME is unset.

Scenario 2:
CMAKE_SYSTEM_NAME unset
Modules: PrgEnv-cray, craype-accel-amd-gfx90a, rocm/5.4.0
LDFLAGS=-fopenmp
Not reproducible with PrgEnv-gnu.

Both scenarios yield the same error message when attempting to configure a project with multiple subdirectories invoking find_package(MPI REQUIRED COMPONENTS C).

[Simple reproducer]
To simulate the issue on a Frontier login node:

mkdir src1 src2
cat << EOF > CMakeLists.txt
project (MY_PROJECT C)
message(STATUS "Configuring src1")
add_subdirectory(src1)
message(STATUS "Configuring src2")
add_subdirectory(src2)
EOF

mkdir src1/src1_subdir1 src1/src1_subdir2
cat << EOF > src1/CMakeLists.txt
add_subdirectory(src1_subdir1)
add_subdirectory(src1_subdir2)
EOF

cat << EOF > src1/src1_subdir1/CMakeLists.txt
message(STATUS "Configuring src1_subdir1")
find_package(MPI REQUIRED COMPONENTS C)
EOF

cat << EOF > src1/src1_subdir2/CMakeLists.txt
message(STATUS "Configuring src1_subdir2")
find_package(MPI REQUIRED COMPONENTS C)
EOF

mkdir src2/src2_subdir
cat << EOF > src2/CMakeLists.txt
add_subdirectory(src2_subdir)
EOF

cat << EOF > src2/src2_subdir/CMakeLists.txt
message(STATUS "Configuring src2_subdir")
find_package(MPI REQUIRED COMPONENTS C)
EOF

mkdir build_scenario_1 && cd build_scenario_1
module load cmake/3.27.9
CC=cc cmake -Wno-dev -DCMAKE_SYSTEM_NAME=Catamount ..

cd ..

mkdir build_scenario_2 && cd build_scenario_2
module load craype-accel-amd-gfx90a rocm/5.4.0
CC=cc LDFLAGS="-fopenmp" cmake -Wno-dev ..

[Observed errors]
Both scenarios fail with the following error:

-- Configuring src1
-- Configuring src1_subdir1
-- Found MPI_C: /opt/cray/pe/craype/2.7.31.11/bin/cc (found version "3.1")
-- Found MPI: TRUE (found version "3.1") found components: C
-- Configuring src1_subdir2
-- Configuring src2
-- Configuring src2_subdir
-- Could NOT find MPI_C (missing: MPI_C_WORKS)
CMake Error at …/FindPackageHandleStandardArgs.cmake:230 (message):
Could NOT find MPI (missing: MPI_C_FOUND C)
Call Stack (most recent call first):
/autofs/nccs-svm1_sw/frontier/spack-envs/core-24.07/opt/gcc-7.5.0/cmake-3.27.9-pyxnvhiskwepbw5itqyipzyhhfw3yitk/share/cmake-3.27/Modules/FindPackageHandleStandardArgs.cmake:600 (_FPHSA_FAILURE_MESSAGE)
/autofs/nccs-svm1_sw/frontier/spack-envs/core-24.07/opt/gcc-7.5.0/cmake-3.27.9-pyxnvhiskwepbw5itqyipzyhhfw3yitk/share/cmake-3.27/Modules/FindMPI.cmake:1837 (find_package_handle_standard_args)
src2/src2_subdir/CMakeLists.txt:2 (find_package)

[Observations]

  • Not reproducible: /usr/bin/cmake (3.20.4) and cmake/3.21.3
  • Reproducible: cmake/3.22.2, cmake/3.27.9, cmake 3.31.0 (custom build)
  • Setting CMAKE_SYSTEM_NAME=Catamount, or loading specific environment modules such as craype-accel-amd-gfx90a, appears to trigger the issue.
  • If we use mpicc instead of the Cray cc compiler wrapper, the configuration of src2_subdir hangs indefinitely.
  • Likely regression introduced in 3.22 and persisting in 3.31.
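
For anyone who wants to repeat the version comparison, here is a minimal sketch of how the Scenario 1 configure step could be swept over the CMake modules listed above. It assumes it is run from the reproducer's top-level directory with PrgEnv-cray or PrgEnv-gnu already loaded, and that `module load` swaps an already loaded cmake module (which may not hold for every module system). The helper name `classify_log` and the `build_cmake_*` directory names are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: configure the reproducer once per CMake module and report
# which versions hit the FindMPI error. Module names are the ones listed
# in the observations above.

# Hypothetical helper: print FAIL if a configure log contains the
# FindMPI error text, otherwise PASS.
classify_log() {
  if grep -q "Could NOT find MPI" "$1"; then
    echo FAIL
  else
    echo PASS
  fi
}

# Only attempt the sweep where the `module` command is available
# (for example on a Frontier login node).
if command -v module >/dev/null 2>&1; then
  for v in 3.21.3 3.22.2 3.27.9; do
    module load "cmake/$v"
    build="build_cmake_$v"
    rm -rf "$build" && mkdir "$build"
    # Same command line as Scenario 1; keep stdout and stderr in one log.
    (cd "$build" && CC=cc cmake -Wno-dev -DCMAKE_SYSTEM_NAME=Catamount .. \
        > configure.log 2>&1)
    echo "cmake/$v: $(classify_log "$build/configure.log")"
  done
fi
```

Each version gets a fresh build directory, so a cache left behind by one CMake version cannot influence the next run.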

[Request for assistance]
Could you please confirm if this is a known regression or provide guidance on addressing it? We believe this issue may stem from changes to FindMPI.cmake or the way subdirectory configurations propagate MPI state.

Thank you for your attention and support!

Best regards,

Danqing Wu

There were several changes to FindMPI between 3.21 and 3.22. In particular, CMake MR 6264 made a Cray-related change.

Hi Brad,

Thank you for the information and for pointing us to the CMake merge request that might have caused this regression.

Do you think the issue could be resolved in a future version of CMake, such as 3.31.2 or beyond? If not, could you suggest any modifications to the test case to ensure it works with CMake 3.22 or later?

Thanks again for your help!

Best regards,

Danqing

Hi Brad,

I have found a workaround for this issue.

In the root CMakeLists.txt of the test case, adding a redundant find_package(MPI) call avoids the problem (not required for CMake 3.21 or 3.20):
project(MY_PROJECT C)
find_package(MPI REQUIRED)
message(STATUS "Configuring src1")
add_subdirectory(src1)
message(STATUS "Configuring src2")
add_subdirectory(src2)

While this workaround resolves the issue for now, it would be greatly appreciated if CMake could address this regression. The FindMPI.cmake module should work consistently across different subdirectories without requiring this additional step at the root level.
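
In the same heredoc style as the reproducer, the workaround could be applied roughly as follows. This is only a sketch: `REPRO_DIR` is a hypothetical variable naming the reproducer's top-level directory (defaulting to the current directory), and the final check merely confirms the ordering of the two calls in the generated file, not that MPI detection succeeds.

```shell
# Sketch: rewrite the root CMakeLists.txt of the reproducer with the workaround.
# REPRO_DIR is a hypothetical variable; it defaults to the current directory.
REPRO_DIR="${REPRO_DIR:-.}"

cat << 'EOF' > "$REPRO_DIR/CMakeLists.txt"
project(MY_PROJECT C)
# Workaround: find MPI once at the root, before any subdirectory does.
find_package(MPI REQUIRED)
message(STATUS "Configuring src1")
add_subdirectory(src1)
message(STATUS "Configuring src2")
add_subdirectory(src2)
EOF

# Confirm that the find_package call precedes the first add_subdirectory call.
fp_line=$(grep -n 'find_package(MPI' "$REPRO_DIR/CMakeLists.txt" | head -n 1 | cut -d: -f1)
sd_line=$(grep -n 'add_subdirectory(' "$REPRO_DIR/CMakeLists.txt" | head -n 1 | cut -d: -f1)
[ "$fp_line" -lt "$sd_line" ] && echo "workaround in place"
```

After regenerating the file, it is probably safest to re-run the configure step from an empty build directory rather than reusing build_scenario_1 or build_scenario_2.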

PS: It seems that some symbols in my original email were unexpectedly altered in the post:
[1] "cat << EOF > CMakeLists.txt" was changed to "cat < CMakeLists.txt" (missing EOF)
[2] ".." (two dots) became "…" (three dots)

Best regards,

Danqing