Linkage of libcublasLt.so from CUDA_TOOLKIT_ROOT_DIR/lib64

I’m building a library that depends on CUDA & cuBLAS.
A peculiarity I’m seeing is that the paths are resolved for the CUDA
& cuBLAS libraries

--> ldd libCUDA-SGEMM.so
linux-vdso.so.1 (0x00007ffda4173000)
libcuda.so.1 => /usr/lib/x86_64-linux-gnu/libcuda.so.1 (0x00007f5360f88000)
libcublas.so.10 => /gpfs/fs1/SHARE/Utils/CUDA/10.2.89.0_440.33.01/lib64/libcublas.so.10 (0x00007f535ccd2000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f535c8e1000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f535c543000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f535c33f000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f535c120000)
librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f535bf18000)
libcublasLt.so.10 => not found
libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f535bb8f000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f535b977000)
/lib64/ld-linux-x86-64.so.2 (0x00007f5362373000)

while the path for libcublasLt.so is not.
Explicitly setting the $LD_LIBRARY_PATH takes care of this
at runtime, but I would prefer to have the path already set so the
user doesn’t have to figure this out.
These are the rules that I’m using in my CMakeLists.txt file

find_package(CUDA QUIET REQUIRED) # Keep this form for backward-compatibility with older CMake.
link_directories(${CUDA_TOOLKIT_ROOT_DIR}/lib64)
link_directories(${CUDA_TOOLKIT_ROOT_DIR}/lib64/stubs)
# Dummy library for builds on non-GPU nodes.
list(APPEND CUDA_DEV_LIBRARIES ${CUDA_LIBRARIES} ${CUDA_CUBLAS_LIBRARIES} ${CUDA_cublas_LIBRARY} ${CUDA_driver_LIBRARY} ${CUDA_TOOLKIT_ROOT_DIR}/lib64/libcublasLt.so)
list(APPEND CUDA_CUBLAS_LIBRARIES ${CUDA_TOOLKIT_ROOT_DIR}/lib64/libcublasLt.so)
set(CUDA_NVCC_FLAGS "${CUDA_NVCC_FLAGS} --std=c++11")
CUDA_ADD_LIBRARY(CUDA-SGEMM SHARED CUDADriver.c)
target_link_libraries(CUDA-SGEMM cuda cublas cublasLt)

which I put together by trial-and-error. I’m using cmake
3.16.2.
I would hope there’s just some simple setting I need to use – maybe
I’m over-specifying in the above rules – but I haven’t found any
info on this by searching the web.
I get the impression that cmake has some mechanisms
already in place for handling CUDA & cuBLAS but wasn’t
built to handle cublasLt.

I had originally posted this as a “Usage” issue but it looks like “code” is more correct.
I’m sorry about spamming everyone with duplicate messages; I’m new to this forum.

If you run readelf -a libCUDA-SGEMM.so | grep -i path, what RUNPATH entries are stored? Also, what path are you adding to LD_LIBRARY_PATH?

The readelf gives me this

 0x000000000000001d (RUNPATH)   Library runpath: [/gpfs/fs1/SHARE/Utils/CUDA/10.2.89.0_440.33.01/lib64:/gpfs/fs1/SHARE/Utils/CUDA/10.2.89.0_440.33.01/lib64/stubs]

which corresponds to the arguments I’d used here

  link_directories(${CUDA_TOOLKIT_ROOT_DIR}/lib64)
  link_directories(${CUDA_TOOLKIT_ROOT_DIR}/lib64/stubs)

and the first directory contains the library and its links

 /gpfs/fs1/SHARE/Utils/CUDA/10.2.89.0_440.33.01/lib64/libcublasLt.so -> libcublasLt.so.10
 /gpfs/fs1/SHARE/Utils/CUDA/10.2.89.0_440.33.01/lib64/libcublasLt.so.10 -> libcublasLt.so.10.2.2.89
 /gpfs/fs1/SHARE/Utils/CUDA/10.2.89.0_440.33.01/lib64/libcublasLt.so.10.2.2.89

If I put this same path on the $LD_LIBRARY_PATH then I get correct resolution

libcublasLt.so.10 => /gpfs/fs1/SHARE/Utils/CUDA/10.2.89.0_440.33.01/lib64/libcublasLt.so.10
libcublas.so.10 => /gpfs/fs1/SHARE/Utils/CUDA/10.2.89.0_440.33.01/lib64/libcublas.so.10 (0x00007fb99e495000)

What puzzles me is why I have to do this for libcublasLt.so.10 but not libcublas.so.10, which is already linked correctly without setting $LD_LIBRARY_PATH

Okay I am pretty sure I understand why this is failing.

RUNPATH on an executable or shared library is only used for finding that object's direct dependencies, not indirect ones. In this case cublasLt is being treated as an indirect dependency ( cublas depends on it, but libcublas.so.10 has no relevant RUNPATH values ).

So why is cublasLt an indirect dependency when you are explicitly requesting it on the link line? Well, the linker by default is now dropping unneeded libraries from the final link, and cublasLt counts as unneeded since you are not directly calling any symbols that are inside of it ( or they are all weak symbols ).

The quickest solution is to add the following to your CMakeLists.txt to disable this linking behavior:

target_link_options(CUDA-SGEMM PRIVATE "LINKER:--no-as-needed")

The longer-term solution is to work with the cuBLAS team to have them set up proper RUNPATH values so that loading cublasLt works no matter where it is installed.
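
One way to check whether cublasLt really ended up as an indirect dependency, as described above, is to look at the NEEDED entries recorded in the library itself. The following is only a sketch: direct_deps and has_direct_dep are hypothetical helper names (not part of readelf or any other tool), and the sed pattern assumes the usual layout of readelf -d output.

```shell
# direct_deps: read `readelf -d` output on stdin and print one NEEDED
# library name per line (the text between the square brackets).
direct_deps() {
    sed -n 's/.*(NEEDED).*\[\(.*\)].*/\1/p'
}

# has_direct_dep NAME: succeed if NAME is listed as a direct (NEEDED)
# dependency in the `readelf -d` output on stdin. NAME is matched literally.
has_direct_dep() {
    direct_deps | grep -qxF -- "$1"
}
```

With these defined, running readelf -d libCUDA-SGEMM.so | has_direct_dep libcublasLt.so.10 should succeed only if the linker kept cublasLt as a direct dependency. If it fails, the RUNPATH stored in libCUDA-SGEMM.so will not be consulted when the loader looks for libcublasLt.so.10.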

This didn’t seem to help. Does the flag need to be applied some other way?
I see the flag being passed through in the file

./CMakeFiles/CUDA-SGEMM.dir/link.txt

here

/usr/bin/c++ -fPIC  -Wl,--no-as-needed -shared -Wl,-soname,libCUDA-SGEMM.so -o libCUDA-SGEMM.so CMakeFiles/CUDA-SGEMM.dir/CUDADriver.c.o   -L/gpfs/fs1/SHARE/Utils/CUDA/10.2.89.0_440.33.01/lib64  -L/gpfs/fs1/SHARE/Utils/CUDA/10.2.89.0_440.33.01/lib64/stubs  -Wl,-rpath,/gpfs/fs1/SHARE/Utils/CUDA/10.2.89.0_440.33.01/lib64:/gpfs/fs1/SHARE/Utils/CUDA/10.2.89.0_440.33.01/lib64/stubs /gpfs/fs1/SHARE/Utils/CUDA/10.2.89.0_440.33.01/lib64/libcudart_static.a -lpthread -ldl /usr/lib/x86_64-linux-gnu/librt.so -lcuda -lcublas 

But it doesn’t show up in the logs; I see --as-needed being used instead, in

./CMakeFiles/CMakeOutput.log

Ok this seemed to do it:

target_link_options(CUDA-SGEMM PRIVATE "LINKER:--no-as-needed")
target_link_options(CUDA-SGEMM PRIVATE "LINKER:-lcublasLt")
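
To confirm a fix like this holds without $LD_LIBRARY_PATH set, the ldd output can be filtered for anything the loader still cannot resolve. This is a minimal sketch; unresolved_libs is a hypothetical helper name, and it assumes ldd reports missing libraries with the "=> not found" form shown earlier in this thread.

```shell
# unresolved_libs: read `ldd` output on stdin and print the name of every
# dependency the loader reported as "not found", one per line.
unresolved_libs() {
    awk '/=> not found/ { print $1 }'
}
```

For example, ldd libCUDA-SGEMM.so | unresolved_libs run in a shell where $LD_LIBRARY_PATH is unset should print nothing once every dependency resolves.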

I’m going to go ahead and close this one. Thanks for the help!!