HPE-Cray environment and OpenMP linking issue

I use CMake 3.20.2, CPE/22/11 (cce/15.0.2) and ROCm 5.2.3.

Assuming the following CMake script:

cmake_minimum_required(VERSION 3.20)
project("test" CXX)
find_package(OpenMP REQUIRED)
add_executable(test_executable test.cc)
target_link_libraries(test_executable
                      PUBLIC OpenMP::OpenMP_CXX)

On a recent HPE-Cray supercomputer, using CMake to compile the following code for AMD MI250 GPUs (gfx90a):

int main() {
#pragma omp                  target
#pragma omp teams distribute parallel for
    for(int i = 0; i < 10; ++i) {
    }
}

would require loading some modules, like so:

module load craype-accel-amd-gfx90a
module load craype-x86-trento
module load PrgEnv-cray

and would also require specifying, in a toolchain file or directly as an environment variable, that the Cray compiler (cce) is to be used:

export CXX=CC

With all of that done, the CMake configuration completes, but when I start building I get linker errors of the following kind:

ld.lld: error: undefined symbol: __tgt_target_kernel
>>> referenced by test.cc
>>>               CMakeFiles/test_executable.dir/test.cc.o:(main)

ld.lld: error: undefined symbol: __kmpc_fork_teams
>>> referenced by test.cc
>>>               CMakeFiles/test_executable.dir/test.cc.o:(__omp_offloading_84f0b5a2_49018856_main_l12)

ld.lld: error: undefined symbol: __kmpc_for_static_init_4
>>> referenced by test.cc
>>>               CMakeFiles/test_executable.dir/test.cc.o:(__omp_offloading_84f0b5a2_49018856_main_l12_cray$mt$p0001)
>>> referenced by test.cc
>>>               CMakeFiles/test_executable.dir/test.cc.o:(__omp_offloading_84f0b5a2_49018856_main_l12_cray$mt$p0002)

ld.lld: error: undefined symbol: _cray$mt_kmpc_fork_call_with_flags
>>> referenced by test.cc
>>>               CMakeFiles/test_executable.dir/test.cc.o:(__omp_offloading_84f0b5a2_49018856_main_l12_cray$mt$p0001)

ld.lld: error: undefined symbol: __kmpc_for_static_fini
>>> referenced by test.cc
>>>               CMakeFiles/test_executable.dir/test.cc.o:(__omp_offloading_84f0b5a2_49018856_main_l12_cray$mt$p0001)
>>> referenced by test.cc
>>>               CMakeFiles/test_executable.dir/test.cc.o:(__omp_offloading_84f0b5a2_49018856_main_l12_cray$mt$p0002)

ld.lld: error: undefined symbol: __tgt_register_requires
>>> referenced by test.cc
>>>               CMakeFiles/test_executable.dir/test.cc.o:(.omp_offloading.requires_reg)
clang-15: error: linker command failed with exit code 1 (use -v to see invocation)

If I change the test case as follows (no more offloading):

int main() {
#pragma omp parallel for
    for(int i = 0; i < 10; ++i) {
    }
}

I get this linker error:

ld.lld: error: undefined symbol: _cray$mt_kmpc_fork_call_with_flags
>>> referenced by test.cc
>>>               CMakeFiles/test_executable.dir/test.cc.o:(main)

ld.lld: error: undefined symbol: __kmpc_for_static_init_4
>>> referenced by test.cc
>>>               CMakeFiles/test_executable.dir/test.cc.o:(main_cray$mt$p0001)

ld.lld: error: undefined symbol: __kmpc_for_static_fini
>>> referenced by test.cc
>>>               CMakeFiles/test_executable.dir/test.cc.o:(main_cray$mt$p0001)

If I unload the craype-accel-amd-gfx90a module, it works fine but, of course, without offloading.

As I would like to use the OpenMP offloading capabilities, this is quite problematic.

It seems that CMake does not produce a build script that links with the HPE-Cray OpenMP runtime if the craype-accel-amd-gfx90a module is loaded.

Here is the output of ninja -v (I tried both ninja and make, and both resulted in the same error).

In the case where craype-accel-amd-gfx90a is not loaded, note the presence of explicit linking of the OpenMP runtime libraries:

[1/2] /opt/cray/pe/craype/2.7.19/bin/CC   -fopenmp -MD -MT CMakeFiles/test_executable.dir/test.cc.o -MF CMakeFiles/test_executable.dir/test.cc.o.d -o CMakeFiles/test_executable.dir/test.cc.o -c ../test.cc
[2/2] : && /opt/cray/pe/craype/2.7.19/bin/CC   CMakeFiles/test_executable.dir/test.cc.o -o test_executable  /opt/cray/pe/libsci/22.11.1.2/CRAY/9.0/x86_64/lib/libsci_cray_mpi_mp.so  /opt/cray/pe/libsci/22.11.1.2/CRAY/9.0/x86_64/lib/libsci_cray_mp.so  /opt/cray/pe/cce/15.0.0/cce/x86_64/lib/libcraymp.so && :

In the case where craype-accel-amd-gfx90a is loaded, note that the link step has neither -fopenmp nor any explicit linking of the OpenMP runtime libraries:

[1/2] /opt/cray/pe/craype/2.7.19/bin/CC   -fopenmp -MD -MT CMakeFiles/test_executable.dir/test.cc.o -MF CMakeFiles/test_executable.dir/test.cc.o.d -o CMakeFiles/test_executable.dir/test.cc.o -c ../test.cc
[2/2] : && /opt/cray/pe/craype/2.7.19/bin/CC   CMakeFiles/test_executable.dir/test.cc.o -o test_executable   && :

The compilation step is unchanged, but the linking step is wrong when craype-accel-amd-gfx90a is loaded.

I could work around the issue by adding the following to the CMake script:

target_link_options(test_executable
                    PUBLIC -fopenmp)

Which gives:

[1/2] /opt/cray/pe/craype/2.7.19/bin/CC   -fopenmp -MD -MT CMakeFiles/test_executable.dir/test.cc.o -MF CMakeFiles/test_executable.dir/test.cc.o.d -o CMakeFiles/test_executable.dir/test.cc.o -c ../test.cc
[2/2] : && /opt/cray/pe/craype/2.7.19/bin/CC  -fopenmp CMakeFiles/test_executable.dir/test.cc.o -o test_executable   && :
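
A slightly more guarded variant of this workaround, as a sketch only: it adds the link flag only when CMake reports the Cray wrapper, and it reuses the flag that FindOpenMP detected instead of hard-coding -fopenmp. It assumes that CMAKE_CXX_COMPILER_WRAPPER is set to CrayPrgEnv in this environment (the "Cray Programming Environment" line in the configure output suggests so, but treat it as an assumption). The variable name omp_link_flags is only illustrative.

```cmake
cmake_minimum_required(VERSION 3.20)
project("test" CXX)
find_package(OpenMP REQUIRED)
add_executable(test_executable test.cc)
target_link_libraries(test_executable
                      PUBLIC OpenMP::OpenMP_CXX)

# Assumption: the Cray CC wrapper is reported as "CrayPrgEnv".
if(CMAKE_CXX_COMPILER_WRAPPER STREQUAL "CrayPrgEnv")
    # OpenMP_CXX_FLAGS is a single string (here "-fopenmp");
    # split it into a list in case it ever holds several flags.
    separate_arguments(omp_link_flags NATIVE_COMMAND "${OpenMP_CXX_FLAGS}")
    # Pass the OpenMP flag(s) to the link step as well.
    target_link_options(test_executable
                        PUBLIC ${omp_link_flags})
endif()
```

With other compilers the condition is false and the script behaves exactly like the original one.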

The CMake configuration output is essentially the same whether or not the module is loaded:

-- The CXX compiler identification is Clang 15.0.2
-- Cray Programming Environment 2.7.19 CXX
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /opt/cray/pe/craype/2.7.19/bin/CC - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found OpenMP_CXX: -fopenmp (found version "5.0") 
-- Found OpenMP: TRUE (found version "5.0")  
-- Configuring done
-- Generating done
-- Build files have been written to: XXXX/build
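
Since the configure output does not show any difference, one way to see what FindOpenMP actually recorded in each environment is to print its result variables and the interface properties of the imported target. This is a diagnostic sketch to place after find_package(OpenMP REQUIRED); the variable names omp_iface_libs and omp_iface_link_opts are only illustrative, and I am not asserting what the values will be.

```cmake
# Diagnostic only: show what FindOpenMP detected for C++.
message(STATUS "OpenMP_CXX_FLAGS:     ${OpenMP_CXX_FLAGS}")
message(STATUS "OpenMP_CXX_LIB_NAMES: ${OpenMP_CXX_LIB_NAMES}")
message(STATUS "OpenMP_CXX_LIBRARIES: ${OpenMP_CXX_LIBRARIES}")

# Show what the imported target passes on to consumers at link time.
# An unset property is reported as "<variable>-NOTFOUND".
get_target_property(omp_iface_libs
                    OpenMP::OpenMP_CXX INTERFACE_LINK_LIBRARIES)
get_target_property(omp_iface_link_opts
                    OpenMP::OpenMP_CXX INTERFACE_LINK_OPTIONS)
message(STATUS "INTERFACE_LINK_LIBRARIES: ${omp_iface_libs}")
message(STATUS "INTERFACE_LINK_OPTIONS:   ${omp_iface_link_opts}")
```

Comparing the two runs (module loaded or not) should show whether the missing runtime libraries come from the detection step or from somewhere later.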

It looks like CMake only adds the OpenMP flag to the link line for Fujitsu and IntelLLVM. I suspect that the Cray compiler needs that as well. Could you please file an issue?
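
Until that is handled in CMake itself, a project could approximate the same behavior by attaching the flag to the imported target rather than to each executable. This is a sketch, not what FindOpenMP does internally: it assumes CMAKE_CXX_COMPILER_WRAPPER equals CrayPrgEnv under the Cray wrapper, and omp_link_flags is an illustrative name.

```cmake
find_package(OpenMP REQUIRED)

# Assumption: the Cray CC wrapper is reported as "CrayPrgEnv".
if(CMAKE_CXX_COMPILER_WRAPPER STREQUAL "CrayPrgEnv")
    separate_arguments(omp_link_flags NATIVE_COMMAND "${OpenMP_CXX_FLAGS}")
    # Every target that links OpenMP::OpenMP_CXX now also gets the
    # OpenMP flag(s) on its link line.
    set_property(TARGET OpenMP::OpenMP_CXX APPEND PROPERTY
                 INTERFACE_LINK_OPTIONS ${omp_link_flags})
endif()
```

The advantage over target_link_options on each executable is that the fix lives in one place, right next to the find_package call.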