
I've written a CUDA application which compiles and runs. However, when I try to debug or run it through Eclipse CDT, or through kdbg, I get an error message such as:

/path/to/executable: error while loading shared libraries: libnvToolsExt.so.1: cannot open shared object file: No such file or directory

or a similar message with libcudart.so.10.2 instead.

Why is this happening if the executable runs on its own, and what can I do about it?

Information about my system:

  • A Debian-derived GNU/Linux
  • CUDA 10.2 installed manually (with no distribution-supplied CUDA packages installed)
  • Eclipse CDT version 2018-09 (4.9.0)
  • kdbg version 2.5.5
  • x86_64 machine
einpoklum

1 Answer


A manual installation of the CUDA toolkit (with or without the NVIDIA kernel driver) does not make its libraries "visible" system-wide. Any non-CUDA binary (compiler, linker, loader, etc.) will simply not be aware of the installation. Specifically, when you try to run an executable built to use shared libraries, the dynamic loader (ld.so, which on your system is part of glibc; not to be confused with the build-time linker, GNU ld) must be able to find those libraries. For a given executable, you can obtain a list of them using readelf (or using other methods). A typical example:

$ readelf -d my_cuda_app | grep 'NEEDED'
 0x0000000000000001 (NEEDED)             Shared library: [libnvToolsExt.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libpthread.so.0]
 0x0000000000000001 (NEEDED)             Shared library: [libdl.so.2]
 0x0000000000000001 (NEEDED)             Shared library: [librt.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libcudart.so.10.2]
 0x0000000000000001 (NEEDED)             Shared library: [libcupti.so.10.2]
 0x0000000000000001 (NEEDED)             Shared library: [libOpenCL.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libstdc++.so.6]
 0x0000000000000001 (NEEDED)             Shared library: [libm.so.6]
 0x0000000000000001 (NEEDED)             Shared library: [libgcc_s.so.1]
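
While readelf lists what the executable asks for, ldd shows what the dynamic loader actually manages to resolve. The following is a minimal sketch, assuming the usual glibc ldd output format in which an unresolved library appears as `name => not found`; the helper name `missing_libs` is made up for illustration:

```shell
# Hypothetical helper: read `ldd` output on stdin and print only the
# names of the shared libraries the dynamic loader could not resolve.
missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# Intended use (not run here; my_cuda_app is the example executable from above):
#   ldd my_cuda_app | missing_libs
```

If this prints nothing in your shell but the debugger still reports a missing library, the difference most likely lies in the environment the debugger passes to the program rather than in the executable itself.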

There are (at least) two ways of making a shared library accessible (i.e. adding it to the dynamic loader's search path):

  1. Add the library's directory to the LD_LIBRARY_PATH environment variable.
  2. Add the library's directory to /etc/ld.so.conf, or to a file in /etc/ld.so.conf.d, in case your /etc/ld.so.conf includes configuration files from that subdirectory.
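
The first approach is usually applied in a shell startup file. Here is a minimal sketch of it; the function name `prepend_ld_library_path` is hypothetical, and the directory in the usage comment is the one from this answer's example installation:

```shell
# Hypothetical helper: put a directory at the front of LD_LIBRARY_PATH,
# unless it is already listed there.
prepend_ld_library_path() {
    case ":${LD_LIBRARY_PATH:-}:" in
        *":$1:"*) ;;  # already present, nothing to do
        *) LD_LIBRARY_PATH="$1${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}" ;;
    esac
    export LD_LIBRARY_PATH
}

# Intended use, e.g. in ~/.bashrc (not run here):
#   prepend_ld_library_path /usr/local/cuda-10.2/targets/x86_64-linux/lib
```

Note that this only affects processes started from a shell that ran it, and only if nothing in between resets the variable.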

Unfortunately, the manual CUDA installer does not offer to apply the second approach for you, nor does it suggest you may want to do so yourself.

You must have chosen the first of these two approaches, and thus can execute your binary from within a shell session. However, Eclipse CDT and kdbg (and possibly other IDEs and debuggers) are rather strict w.r.t. the execution of built programs, and appear to be "scrubbing" the LD_LIBRARY_PATH variable from the executable's environment (or possibly never inherit it from the shell in which you set it).
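
One way to check this, rather than guess, is to look at the environment of the running IDE or debugger process, which Linux exposes as a NUL-separated list in /proc/PID/environ. This is only a sketch: the helper name `env_dump_has_var` is made up, and the `pgrep -n kdbg` lookup in the usage comment assumes the process is actually named kdbg on your system:

```shell
# Hypothetical helper: succeed if the NUL-separated environment dump in
# file $1 (e.g. /proc/PID/environ) defines the variable named $2.
env_dump_has_var() {
    tr '\0' '\n' < "$1" | grep -q "^$2="
}

# Intended use (not run here):
#   env_dump_has_var "/proc/$(pgrep -n kdbg)/environ" LD_LIBRARY_PATH \
#       && echo "set" || echo "not set"
```

If the variable is absent from the debugger's own environment, the program it launches will normally not see it either.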

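Instead of, or in addition to, setting LD_LIBRARY_PATH, create a file in /etc/ld.so.conf.d, e.g. /etc/ld.so.conf.d/cuda.conf (the name must match the include pattern in your /etc/ld.so.conf, which is typically *.conf on Debian-derived systems), containing your manual CUDA installation's library directory, e.g.: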

/usr/local/cuda-10.2/targets/x86_64-linux/lib

Then run ldconfig as root, so that the loader's cache is rebuilt. This should allow kdbg and Eclipse CDT to debug your app.
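
The configuration step can be sketched as follows. The helper name `write_ld_conf_fragment` is hypothetical, and the file name cuda.conf assumes that /etc/ld.so.conf includes /etc/ld.so.conf.d/*.conf, as is typical on Debian-derived systems; check your own /etc/ld.so.conf before relying on it:

```shell
# Hypothetical helper: write a one-line configuration fragment for the
# dynamic loader.
#   $1: fragment file to create, e.g. /etc/ld.so.conf.d/cuda.conf
#   $2: directory holding the shared libraries
write_ld_conf_fragment() {
    [ -d "$2" ] || { echo "no such directory: $2" >&2; return 1; }
    printf '%s\n' "$2" > "$1"
}

# Intended use, as root (not run here):
#   write_ld_conf_fragment /etc/ld.so.conf.d/cuda.conf \
#       /usr/local/cuda-10.2/targets/x86_64-linux/lib
#   ldconfig                        # rebuild /etc/ld.so.cache
#   ldconfig -p | grep libcudart    # the library should now be listed
```

Unlike LD_LIBRARY_PATH, this setting does not depend on the environment of whichever process launches the executable.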

einpoklum
  • 2
    Yes, that's exactly what's happening. I'm a little on the fence here, if this particular instance of Q&A should remain on StackOverflow though. Working with libraries and how the linker locates libraries, and what influences it a) goes far beyond CUDA and b) should be considered foundational developer knowledge. I'd rather have a more general Q&A here, or have this particular instance deleted. It's kind of redundant, and to superficial eyes might look like a corner case, while it's not. – datenwolf May 16 '20 at 10:35
  • @datenwolf: A lot of questions about specific frameworks/libraries/environments end up being a specific case of a more general/fundamental issue. However - if that issue comes up easily or often, without the developer messing up their system in some esoteric way - then such questions are legitimate. Having said that - if there's a Q&A about the same issue that doesn't mention CUDA particularly, this question can probably be made a dupe of that one. Also, see my upcoming edit. – einpoklum May 16 '20 at 18:57