Linking cufft_static with a CXX target

The cuFFT/1d_c2c sample by NVIDIA provides a CMakeLists.txt that links CUDA::cufft. Modifying it to link against CUDA::cufft_static instead causes numerous linker errors. The cuFFT docs provide some guidance here, so I modified the CMakeLists.txt accordingly to link against CMAKE_DL_LIBS and pthreads (Threads::Threads) and turned on CUDA_SEPARABLE_COMPILATION. This still doesn't work, because CMake invokes g++ for linking instead of nvcc. I would have thought that setting the LINKER_LANGUAGE property to CUDA would fix this, but it does not. Is that a bug, or am I missing something?

The easy way around this is to change the source file name to use .cu instead of .cpp. One still needs the CUDA_SEPARABLE_COMPILATION property, but pthreads and CMAKE_DL_LIBS no longer need to be linked explicitly. But since the file doesn't contain any actual CUDA (i.e. device) code, I see this only as a workaround.
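If renaming the file is undesirable, CMake can also be told to compile a .cpp file as CUDA via the LANGUAGE source property. A minimal sketch, assuming the target and source names from my CMakeLists.txt below:

```cmake
# Sketch: compile the C++ source with the CUDA compiler (nvcc)
# without renaming it to .cu.
set_source_files_properties(1d_c2c_example.cpp
  PROPERTIES LANGUAGE CUDA)
```

This has the same effect as the rename, so it doesn't address the underlying question of why the .cpp + LINKER_LANGUAGE combination fails.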

My modified (and cleaned up) CMakeLists.txt:

cmake_minimum_required(VERSION 3.18)

set(ROUTINE 1d_c2c)

project(
  "${ROUTINE}_example"
  DESCRIPTION "GPU-Accelerated Fast Fourier Transforms"
  HOMEPAGE_URL "https://docs.nvidia.com/cuda/cufft/index.html"
  LANGUAGES CUDA CXX)

find_package(CUDAToolkit REQUIRED)
find_package(Threads REQUIRED)

add_executable(${ROUTINE}_example)

set_target_properties(${ROUTINE}_example
  PROPERTIES
    LINKER_LANGUAGE CUDA
    CUDA_SEPARABLE_COMPILATION ON
    RUNTIME_OUTPUT_DIRECTORY ${CMAKE_BINARY_DIR}/bin)

target_compile_features(${ROUTINE}_example
  PRIVATE cxx_std_11)

target_sources(${ROUTINE}_example
  PRIVATE ${PROJECT_SOURCE_DIR}/${ROUTINE}_example.cpp)

target_include_directories(${ROUTINE}_example
  PRIVATE ${CMAKE_SOURCE_DIR}/../utils)

target_link_libraries(${ROUTINE}_example
  PRIVATE
    CUDA::cufft_static
    ${CMAKE_DL_LIBS}
    Threads::Threads)

I am using CMake 3.23.1, CUDA Toolkit 11.8, and gcc 11.3.

CUDA_SEPARABLE_COMPILATION is only needed when the code you are compiling itself requires separate compilation. In your case you just need the device-linking step to occur, so enabling CUDA_RESOLVE_DEVICE_SYMBOLS is sufficient (this works locally with a C++ source file).
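Concretely, the property change can be sketched as follows (assuming the target name from the CMakeLists.txt above):

```cmake
# Sketch: request only the device-link step instead of full
# separable compilation; replaces CUDA_SEPARABLE_COMPILATION ON.
set_target_properties(1d_c2c_example
  PROPERTIES
    CUDA_RESOLVE_DEVICE_SYMBOLS ON)
```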

When building locally I modified the target_link_libraries call to look like:

target_link_libraries(1d_c2c
  PRIVATE
    CUDA::cufft_static
    CUDA::cudart_static
    CUDA::culibos
    )

That compiles and links correctly. Since the example uses runtime API functions like cudaMallocAsync, we do need to link against the CUDA runtime (CUDA::cudart_static).

But you have identified an issue with CUDA::cufft_static where it isn't expressing the proper link requirements on pthreads and dl, which I will fix.


CUDA::culibos should automatically come with CUDA::cufft_static according to the docs.

Replacing CUDA_SEPARABLE_COMPILATION with CUDA_RESOLVE_DEVICE_SYMBOLS and adding CUDA::cudart_static is enough for it to compile for me (even without setting LINKER_LANGUAGE to CUDA).

Are you saying that CUDA::cudart_static expresses these requirements for pthreads and dl, but CUDA::cufft_static should as well so that it could be used without CUDA::cudart_static (e.g. with the driver API, I guess)?

I was actually wondering whether it is expected that CUDA::cufft_static comes with culibos, while the bare cufft_static that becomes available just by enabling the CUDA language, without find_package(CUDAToolkit), does not automatically include it.

Are these names without the CUDA:: namespace a feature or just an implementation detail? In contrast to the CUDA:: ones, they don't seem to be documented at all.

Mainly correct. This allows CUDA::cufft_static to be used with either the shared or static cudart (CUDA::cudart or CUDA::cudart_static).

No CUDA:: targets are brought in by enabling the CUDA language. These are created by find_package(CUDAToolkit), whose documentation lists all the user-facing targets (including CUDA::cufft_static).
The non-namespaced names aren't targets; they are just libraries on disk. So target_link_libraries(my_target PUBLIC cufft) simply expects cufft.so or cufft.a to exist in an implicit link directory of the system, which is the case for projects that have CUDA enabled.
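To illustrate the distinction, a minimal sketch (the target name my_target is hypothetical):

```cmake
# Imported target: provided by find_package(CUDAToolkit); carries
# include directories and transitive link requirements with it.
find_package(CUDAToolkit REQUIRED)
target_link_libraries(my_target PUBLIC CUDA::cufft)

# Bare name: resolved as a library file (libcufft.so / libcufft.a)
# found in the system's implicit link directories; no usage
# requirements come along with it.
target_link_libraries(my_target PUBLIC cufft)
```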


Makes sense, thank you!