I'm linking a program using NVIDIA's PTX compiler library, with a link command generated by CMake, like so:

/usr/bin/c++ -O3 -DNDEBUG \
    CMakeFiles/vectorAdd_ptx.dir/modified_cuda_samples/vectorAdd_ptx/vectorAdd_ptx.cpp.o \
    -o bin/vectorAdd_ptx  \
    -Wl,-rpath,/usr/local/cuda-11.7/lib64:/usr/local/cuda-11.7/lib64/stubs \
    /usr/local/cuda-11.7/lib64/libcudart.so \
    -lpthread \
    /usr/local/cuda-11.7/lib64/libnvrtc.so \
    /usr/local/cuda-11.7/lib64/stubs/libcuda.so \
    /usr/local/cuda-11.7/lib64/libnvptxcompiler_static.a 

or rather, a GitHub Actions Ubuntu 20 VM is linking my program this way. This command yields the following output:

/usr/bin/ld: /usr/local/cuda-11.7/lib64/libnvptxcompiler_static.a(ptxtmp245.o): in function `__ptx989':
stdThreads.cpp:(.text+0x3c): undefined reference to `pthread_key_create'
/usr/bin/ld: stdThreads.cpp:(.text+0x44): undefined reference to `pthread_mutexattr_init'
/usr/bin/ld: stdThreads.cpp:(.text+0x51): undefined reference to `pthread_mutexattr_settype'
/usr/bin/ld: stdThreads.cpp:(.text+0x68): undefined reference to `pthread_mutexattr_destroy'
/usr/bin/ld: /usr/local/cuda-11.7/lib64/libnvptxcompiler_static.a(ptxtmp245.o): in function `__ptx2472':
stdThreads.cpp:(.text+0xfe): undefined reference to `pthread_setspecific'
/usr/bin/ld: /usr/local/cuda-11.7/lib64/libnvptxcompiler_static.a(ptxtmp245.o): in function `__ptx18096':
stdThreads.cpp:(.text+0x14e): undefined reference to `pthread_join'
/usr/bin/ld: /usr/local/cuda-11.7/lib64/libnvptxcompiler_static.a(ptxtmp245.o): in function `__ptx18072':
stdThreads.cpp:(.text+0x169): undefined reference to `sem_init'
/usr/bin/ld: /usr/local/cuda-11.7/lib64/libnvptxcompiler_static.a(ptxtmp245.o): in function `__ptx18073':
stdThreads.cpp:(.text+0x1ce): undefined reference to `sem_wait'
/usr/bin/ld: /usr/local/cuda-11.7/lib64/libnvptxcompiler_static.a(ptxtmp245.o): in function `__ptx18075':
stdThreads.cpp:(.text+0x1fe): undefined reference to `sem_trywait'
/usr/bin/ld: /usr/local/cuda-11.7/lib64/libnvptxcompiler_static.a(ptxtmp245.o): in function `__ptx18035':
stdThreads.cpp:(.text+0x262): undefined reference to `pthread_mutexattr_init'
/usr/bin/ld: stdThreads.cpp:(.text+0x26f): undefined reference to `pthread_mutexattr_settype'
/usr/bin/ld: stdThreads.cpp:(.text+0x284): undefined reference to `pthread_mutexattr_destroy'
/usr/bin/ld: /usr/local/cuda-11.7/lib64/libnvptxcompiler_static.a(ptxtmp245.o): in function `__ptx2045':
stdThreads.cpp:(.text+0x384): undefined reference to `sem_destroy'
/usr/bin/ld: /usr/local/cuda-11.7/lib64/libnvptxcompiler_static.a(ptxtmp245.o): in function `__ptx2133':
stdThreads.cpp:(.text+0x425): undefined reference to `pthread_key_delete'
/usr/bin/ld: /usr/local/cuda-11.7/lib64/libnvptxcompiler_static.a(ptxtmp245.o): in function `__ptx18094':
stdThreads.cpp:(.text+0x4cc): undefined reference to `pthread_getspecific'
/usr/bin/ld: /usr/local/cuda-11.7/lib64/libnvptxcompiler_static.a(ptxtmp245.o): in function `__ptx18102':
stdThreads.cpp:(.text+0x528): undefined reference to `sem_init'
/usr/bin/ld: stdThreads.cpp:(.text+0x56e): undefined reference to `sem_wait'
/usr/bin/ld: stdThreads.cpp:(.text+0x591): undefined reference to `sem_destroy'
/usr/bin/ld: stdThreads.cpp:(.text+0x5a8): undefined reference to `pthread_key_delete'
/usr/bin/ld: /usr/local/cuda-11.7/lib64/libnvptxcompiler_static.a(ptxtmp245.o): in function `__ptx2076':
stdThreads.cpp:(.text+0x632): undefined reference to `sem_init'
/usr/bin/ld: /usr/local/cuda-11.7/lib64/libnvptxcompiler_static.a(ptxtmp245.o): in function `__ptx18099':
stdThreads.cpp:(.text+0x6b9): undefined reference to `pthread_getspecific'
/usr/bin/ld: stdThreads.cpp:(.text+0x6e2): undefined reference to `pthread_setspecific'
/usr/bin/ld: /usr/local/cuda-11.7/lib64/libnvptxcompiler_static.a(ptxtmp245.o): in function `__ptx18101':
stdThreads.cpp:(.text+0x7ce): undefined reference to `sem_wait'
/usr/bin/ld: /usr/local/cuda-11.7/lib64/libnvptxcompiler_static.a(ptxtmp245.o): in function `__ptx14389':
stdThreads.cpp:(.text+0x836): undefined reference to `pthread_attr_setstacksize'
/usr/bin/ld: stdThreads.cpp:(.text+0x862): undefined reference to `pthread_create'
/usr/bin/ld: /usr/local/cuda-11.7/lib64/libnvptxcompiler_static.a(ptxtmp245.o): in function `__ptx17995':
stdThreads.cpp:(.text+0x939): undefined reference to `pthread_getspecific'
/usr/bin/ld: stdThreads.cpp:(.text+0x962): undefined reference to `pthread_setspecific'
/usr/bin/ld: /usr/local/cuda-11.7/lib64/libnvptxcompiler_static.a(ptxtmp245.o): in function `__ptx17967':
stdThreads.cpp:(.text+0xb7e): undefined reference to `sem_wait'
/usr/bin/ld: /usr/local/cuda-11.7/lib64/libnvptxcompiler_static.a(ptxtmp245.o): in function `__ptx17962':
stdThreads.cpp:(.text+0xbbf): undefined reference to `sem_post'
/usr/bin/ld: /usr/local/cuda-11.7/lib64/libnvptxcompiler_static.a(ptxtmp245.o): in function `__ptx18098':
stdThreads.cpp:(.text+0x128): undefined reference to `sem_post'
/usr/bin/ld: /usr/local/cuda-11.7/lib64/libnvptxcompiler_static.a(ptxtmp245.o): in function `__ptx18097':
stdThreads.cpp:(.text+0x138): undefined reference to `pthread_kill'
/usr/bin/ld: /usr/local/cuda-11.7/lib64/libnvptxcompiler_static.a(ptxtmp245.o): in function `__ptx18074':
stdThreads.cpp:(.text+0x181): undefined reference to `sem_destroy'
/usr/bin/ld: /usr/local/cuda-11.7/lib64/libnvptxcompiler_static.a(ptxtmp245.o): in function `__ptx18076':
stdThreads.cpp:(.text+0x221): undefined reference to `sem_post'
/usr/bin/ld: /usr/local/cuda-11.7/lib64/libnvptxcompiler_static.a(ptxtmp245.o): in function `__ptx2045':
stdThreads.cpp:(.text+0x39d): undefined reference to `sem_post'
/usr/bin/ld: /usr/local/cuda-11.7/lib64/libnvptxcompiler_static.a(ptxtmp245.o): in function `__ptx18094':
stdThreads.cpp:(.text+0x4ea): undefined reference to `pthread_setspecific'
collect2: error: ld returned 1 exit status

I don't quite understand what's going on. I have linked against the pthreads library, haven't I? And where does stdThreads.cpp come from: the standard library, or the NVIDIA PTX compiler library? Perhaps I need to link against pthreads differently because libnvptxcompiler_static.a is a static library?

I'll mention that when I build this program on my own system (Devuan Daedalus), the link command is:

/usr/bin/c++ -g \
    CMakeFiles/vectorAdd_ptx.dir/modified_cuda_samples/vectorAdd_ptx/vectorAdd_ptx.cpp.o \
    -o bin/vectorAdd_ptx  \
    /usr/local/cuda/lib64/libcudart.so \
    /usr/local/cuda/lib64/libnvrtc.so /usr/lib/x86_64-linux-gnu/libcuda.so \
    /usr/local/cuda/lib64/libnvptxcompiler_static.a 

(it's generated by CMake, then broken up into multiple lines to fit the width here), and it succeeds.

einpoklum
    Linking statements are read left to right. Specify the dependency *after* the library which requires it, not before – talonmies Oct 28 '22 at 13:52

1 Answer

tl;dr: In your CMakeLists.txt file, place a pthreads library dependency after the PTX compiler library dependency.

(Thanks go to @talonmies for this.)

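When a library lib1 uses symbols from lib2 that are not defined in lib1 itself, the linker can only resolve these symbols if it sees lib2 after it has seen lib1, i.e. to the right of lib1 on the command line. Here, libnvptxcompiler_static.a plays the role of lib1 and the pthreads library that of lib2, yet -lpthread appears before the static library in your link command. You likely have a CMake target_link_libraries() arrangement in which the dependency on the threads library is resolved before the dependency on the PTX compiler library, and is therefore placed before it on the command line. Example (lib1 and lib2 below are placeholder CMake target names):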

target_link_libraries(lib1 Threads::Threads)
target_link_libraries(lib2 lib1 CUDA::nvptxcompiler_static)

You need to rearrange this CMake code so that, upon resolution, CUDA::nvptxcompiler_static appears before Threads::Threads as a dependency, e.g.

target_link_libraries(lib1 Threads::Threads)
target_link_libraries(lib2 CUDA::nvptxcompiler_static lib1)
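
For the program in the question, a minimal CMakeLists.txt with the dependencies in a working order might look roughly like the sketch below. The target name and source path are taken from the link command in the question; everything else is an assumption. In particular, it assumes a CMake version whose FindCUDAToolkit module provides CUDA::nvptxcompiler_static (I believe that is 3.25 and later; with an older CMake you would have to put the full path to libnvptxcompiler_static.a there instead).

```cmake
# Assumption: CMake 3.25 or later, so that FindCUDAToolkit defines
# CUDA::nvptxcompiler_static. Adjust to your setup.
cmake_minimum_required(VERSION 3.25)
project(vectorAdd_ptx LANGUAGES CXX)

find_package(Threads REQUIRED)      # defines Threads::Threads
find_package(CUDAToolkit REQUIRED)  # defines the CUDA:: imported targets

add_executable(vectorAdd_ptx
    modified_cuda_samples/vectorAdd_ptx/vectorAdd_ptx.cpp)

# Order matters here: the static PTX compiler library is listed first,
# and the threads library it depends on is listed after it.
target_link_libraries(vectorAdd_ptx PRIVATE
    CUDA::cudart
    CUDA::nvrtc
    CUDA::cuda_driver
    CUDA::nvptxcompiler_static
    Threads::Threads)
```

After re-running CMake, you can build with `cmake --build . --verbose` and check that, in the generated link command, the threads library (if one is added at all on your platform) now comes after libnvptxcompiler_static.a.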
einpoklum