Hi,
CMake (3.26.5 and also 3.30.3) seems to fail to find the CUDA toolkit when it is symlinked via Ubuntu’s update-alternatives. More precisely, I have this sample project CMakeLists.txt from CLion:
cmake_minimum_required(VERSION 3.24)
project(xmimir CXX CUDA)
set(CMAKE_CUDA_STANDARD 20)
message(STATUS "CMAKE_CUDA_COMPILER: ${CMAKE_CUDA_COMPILER}")
message(STATUS "CUDAToolkit_ROOT: ${CUDAToolkit_ROOT}")
find_package(Threads REQUIRED)
find_package(CUDAToolkit REQUIRED)
add_library(xmimir STATIC library.cu)
set_target_properties(
    xmimir
    PROPERTIES
        CUDA_SEPARABLE_COMPILATION ON
)
At work we have an IT setup in which /usr/local is mounted with nfs4 from another server. The CUDA toolkit is then found in the default /usr/local directory:
l /usr/local/
drwxr-xr-x@ - root 20 Aug 11:21 admin/
drwxr-xr-x@ - root 5 Jun 2023 bib/
drwxr-xr-x@ - root 19 Sep 16:41 bin/
lrwxrwxrwx - root 19 Sep 16:38 cuda -> /etc/alternatives/cuda/
drwxr-xr-x@ - root 9 Aug 2022 cuda-10.1/
drwxr-xr-x@ - root 9 Aug 2022 cuda-11.2/
drwxr-xr-x@ - root 4 Aug 2022 cuda-11.7/
lrwxrwxrwx - root 19 Sep 16:38 cuda-12 -> /etc/alternatives/cuda-12/
drwxr-xr-x@ - root 19 Sep 16:38 cuda-12.6/
drwxr-xr-x@ - root 9 Aug 2022 cudnn-10.1-v7.6/
drwxr-xr-x@ - root 9 Aug 2022 cudnn-11.X-v8.4/
and
l /usr/bin/nvcc
lrwxrwxrwx - root 21 Sep 19:12 /usr/bin/nvcc -> /etc/alternatives/nvcc*
as well as the config in update-alternatives:
update-alternatives --display nvcc
nvcc - auto mode
link best version is /usr/local/cuda-12.6/bin/nvcc
link currently points to /usr/local/cuda-12.6/bin/nvcc
link nvcc is /usr/bin/nvcc
/usr/local/cuda-12.6/bin/nvcc - priority 100
update-alternatives --display cuda
cuda - auto mode
link best version is /usr/local/cuda-12
link currently points to /usr/local/cuda-12
link cuda is /usr/local/cuda
/usr/local/cuda-12 - priority 100
update-alternatives --display cuda-12
cuda-12 - auto mode
link best version is /usr/local/cuda-12.6
link currently points to /usr/local/cuda-12.6
link cuda-12 is /usr/local/cuda-12
/usr/local/cuda-12.6 - priority 100
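To rule out a dangling link somewhere in these chains, each hop can be printed separately. The following is only a minimal sketch, assuming a POSIX shell; `show_link_chain` is a hypothetical helper name, not part of update-alternatives or CMake.

```shell
#!/bin/sh
# Hypothetical helper: print every hop of a symlink chain, one per line,
# then report whether the final target exists.
show_link_chain() {
    p=$1
    hops=0
    # Cap the number of hops so a symlink loop cannot spin forever.
    while [ -L "$p" ] && [ "$hops" -lt 40 ]; do
        target=$(readlink "$p")
        echo "$p -> $target"
        # Relative targets are resolved against the directory of the link.
        case $target in
            /*) p=$target ;;
            *)  p=$(dirname "$p")/$target ;;
        esac
        hops=$((hops + 1))
    done
    if [ -e "$p" ]; then
        echo "$p (exists)"
    else
        echo "$p (MISSING)"
        return 1
    fi
}
```

It could be called as `show_link_chain /usr/local/cuda` and `show_link_chain /usr/bin/nvcc`; a non-zero return status would indicate that the chain ends in a path that does not exist on this machine.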
If I configure this project with the symlinked versions of the CUDA toolkit, i.e.
nvcc → /usr/bin/nvcc
cuda → /usr/local/cuda,
then cmake cannot find the cuda toolkit / compiler:
cmake -DCMAKE_BUILD_TYPE=Debug -DCMAKE_C_COMPILER=/usr/bin/gcc-12 -DCMAKE_CXX_COMPILER=/usr/bin/g++-12 -G Ninja -DCUDAToolkit_ROOT=/usr/local/cuda -DCMAKE_CUDA_ARCHITECTURES=80 -DCMAKE_CUDA_COMPILER=/usr/bin/nvcc --debug-output -S /work/rleap1/michael.aichmueller/github/xmimir -B /work/rleap1/michael.aichmueller/github/xmimir/cmake-build-debug-gcc12-nvcc-polonium-broken
Running with debug output on.
-- The CXX compiler identification is GNU 12.3.0
Called from: [3] /u/michael.aichmueller/cmake-3.26.5-linux-x86_64/share/cmake-3.26/Modules/CMakeDetermineCompilerId.cmake
[2] /u/michael.aichmueller/cmake-3.26.5-linux-x86_64/share/cmake-3.26/Modules/CMakeDetermineCXXCompiler.cmake
[1] /work/rleap1/michael.aichmueller/github/xmimir/CMakeLists.txt
CMake Error at /u/michael.aichmueller/cmake-3.26.5-linux-x86_64/share/cmake-3.26/Modules/CMakeDetermineCUDACompiler.cmake:227 (message):
Couldn't find CUDA library root.
Call Stack (most recent call first):
CMakeLists.txt:2 (project)
Called from: [2] /u/michael.aichmueller/cmake-3.26.5-linux-x86_64/share/cmake-3.26/Modules/CMakeDetermineCUDACompiler.cmake
[1] /work/rleap1/michael.aichmueller/github/xmimir/CMakeLists.txt
-- Configuring incomplete, errors occurred!
But if I give the real paths, i.e.
nvcc → /usr/local/cuda-12.6/bin/nvcc
cuda → /usr/local/cuda-12.6,
then cmake is able to find everything:
cmake -DCMAKE_BUILD_TYPE=Debug -DCMAKE_C_COMPILER=/usr/bin/gcc-12 -DCMAKE_CXX_COMPILER=/usr/bin/g++-12 -G Ninja -DCUDAToolkit_ROOT=/usr/local/cuda-12.6 -DCMAKE_CUDA_ARCHITECTURES=80 -DCMAKE_CUDA_COMPILER=/usr/local/cuda-12.6/bin/nvcc --debug-output -S /work/rleap1/michael.aichmueller/github/xmimir -B /work/rleap1/michael.aichmueller/github/xmimir/cmake-build-debug-gcc12-nvcc-polonium
Running with debug output on.
-- CMAKE_CUDA_COMPILER: /usr/local/cuda-12.6/bin/nvcc
Called from: [1] /work/rleap1/michael.aichmueller/github/xmimir/CMakeLists.txt
-- CUDAToolkit_ROOT: /usr/local/cuda-12.6
Called from: [1] /work/rleap1/michael.aichmueller/github/xmimir/CMakeLists.txt
-- Configuring done (0.3s)
-- Generating /work/rleap1/michael.aichmueller/github/xmimir/cmake-build-debug-gcc12-nvcc-polonium
Called from: [1] /work/rleap1/michael.aichmueller/github/xmimir/CMakeLists.txt
-- Generating done (0.0s)
-- Build files have been written to: /work/rleap1/michael.aichmueller/github/xmimir/cmake-build-debug-gcc12-nvcc-polonium
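The real paths used in this working run do not have to be typed by hand; they can be derived from the symlink. A minimal sketch, assuming `readlink -f` is available (GNU coreutils) and that the toolkit uses the usual `<root>/bin/nvcc` layout; `cuda_root_from_nvcc` is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: print the real CUDA toolkit root for a given nvcc path.
# Assumes the layout <root>/bin/nvcc and a readlink that supports -f.
cuda_root_from_nvcc() {
    # Follow all symlinks (e.g. through /etc/alternatives) to the real binary.
    real_nvcc=$(readlink -f "$1") || return 1
    # Strip the trailing /bin/nvcc to get the toolkit root.
    dirname "$(dirname "$real_nvcc")"
}
```

With the alternatives shown above, `cuda_root_from_nvcc /usr/bin/nvcc` would be expected to print /usr/local/cuda-12.6, which can then be passed as -DCUDAToolkit_ROOT and, with /bin/nvcc appended, as -DCMAKE_CUDA_COMPILER. This only works around the problem; it does not explain why the symlinked paths fail.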
Is this behaviour expected?
Is it due to the symlinking via update-alternatives, or does CMake simply see multiple cuda-xx folders and decline to choose, despite being given an explicit path?
If the latter, shouldn’t nvcc at least be found, since there is no alternative in /usr/bin?