Undefined symbol when trying to link with shared library built from CUDA objects

Question

I'm experimenting with building a simple application from a couple of .cu source files and a very simple C++ main that calls a function from one of the .cu files. I'm making a shared library (.so file) from the compiled .cu files. I'm finding that everything builds without trouble, but when I try to run the application, I get a linker undefined symbol error, with the mangled name of the .cu function I'm calling from main(). If I build a static library instead, my application runs just fine. Here's the makefile I've set up:

.PHONY: clean
NVCCFLAGS = -std=c++11 --compiler-options '-fPIC'
CXXFLAGS = -std=c++11
HLIB = libhello.a
SHLIB = libhello.so
CUDA_OBJECTS = bridge.o add.o

all: driver

%.o :: %.cu
    nvcc -o $@ $(NVCCFLAGS) -c -I. $<

%.o :: %.cpp
    c++ $(CXXFLAGS) -o $@ -c -I. $<

$(HLIB): $(CUDA_OBJECTS)
    ar rcs $@ $^

$(SHLIB): $(CUDA_OBJECTS)
    nvcc $(NVCCFLAGS) --shared  -o $@ $^

#driver : driver.o $(HLIB)
#   c++ -std=c++11 -fPIC -o $@ driver.o -L. -lhello -L/usr/local/cuda-10.1/targets/x86_64-linux/lib -lcudart

driver : driver.o $(SHLIB)
    c++ -std=c++11 -fPIC -o $@ driver.o -L. -lhello

clean:
    -rm -f driver *.o *.so *.a

Here are the various source files that the makefile takes as fodder. add.cu:

__global__ void add(int n, int* a, int* b, int* c) {
    int index = threadIdx.x;
    int stride = blockDim.x;

    for (int ii = index; ii < n; ii += stride) {
        c[ii] = a[ii] + b[ii];
    }
}

add.h:

extern __global__ void add(int n, int* a, int* b, int* c);

bridge.cu:

#include <iostream>
#include "add.h"

void bridge() {
    int N = 1 << 16;
    int blockSize = 256;
    int numBlocks = (N + blockSize - 1)/blockSize;

    int* a;
    int* b;
    int* c;

    cudaMallocManaged(&a, N*sizeof(int));
    cudaMallocManaged(&b, N*sizeof(int));
    cudaMallocManaged(&c, N*sizeof(int));

    for (int ii = 0; ii < N; ii++) {
        a[ii] = ii;
        b[ii] = 2*ii;
    }

    add<<<numBlocks, blockSize>>>(N, a, b, c);

    cudaDeviceSynchronize();

    for (int ii = 0; ii < N; ii++) {
        std::cout << a[ii] << " + " << b[ii] << " = " << c[ii] << std::endl;
    }

    cudaFree(a);
    cudaFree(b);
    cudaFree(c);
}

bridge.h:

extern void bridge();

driver.cpp:

#include "bridge.h"

int main() {
    bridge();
    return 0;
}

I'm very new to CUDA, so I expect that's where I'm doing something wrong. I've played a bit with using extern "C" declarations, but that just seems to move the "undefined symbol" error from run time to build time.

I'm familiar with various ways that one can end up with an undefined symbol, and I've mentioned various experiments I've already performed (static linking, extern "C" declarations) that make me think that this problem isn't addressed by the proposed duplicate question.

My unresolved symbol is _Z6bridgev

It looks to me as though the linker should be able to resolve the symbol. If I run nm on driver.o, I see:

0000000000000000 T main
                 U _Z6bridgev

And if I run nm on libhello.so, I see:

0000000000006e56 T _Z6bridgev
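The two nm listings above only show that the symbol exists in the library file. They do not show whether the loader can locate that file when the program starts. A minimal sketch of checking both, assuming GNU binutils (nm, c++filt) and ldd are installed and using the file names from the makefile. The function names `exported_symbols` and `loader_view` are hypothetical helpers, not part of any tool:

```shell
# Assumption: GNU binutils (nm, c++filt) and ldd are installed.
# File names (libhello.so, driver) are taken from the makefile in the question.

# Print the demangled dynamic symbols that a shared library defines.
exported_symbols() {
    nm -D --defined-only "$1" | c++filt
}

# Print the shared libraries the loader would resolve for an executable.
# A needed library that the loader cannot locate is reported as "not found".
loader_view() {
    ldd "$1"
}

# Only inspect files that have actually been built.
if [ -f libhello.so ]; then
    exported_symbols libhello.so | grep 'bridge' || echo "bridge() is not exported"
fi
if [ -x driver ]; then
    loader_view driver | grep 'libhello' || echo "driver does not list libhello"
fi
```

If the first check finds `bridge()` but the second reports the library as not found, the build is probably fine and the problem lies in where the loader searches at run time.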

The cited dup is part of the issue, but I don't think it is the controlling issue in this case. I think the OP needs help fixing the makefile, along with an explanation of how to write recipes and use variables like `LDFLAGS` and `LDLIBS`. Reopen. — jww, Jun 17 '19 at 19:22
@jww Thanks for your comment. More specifically, am I using nvcc improperly in building a shared library, or missing some flags in my link command? I've edited my question again to show that I believe the library I've built has the symbol that driver.o needs to have resolved. — user888379, Jun 17 '19 at 19:34
`echo _Z6bridgev | c++filt` results in `bridge()`. What library is the `bridge()` function in, and where is the library located? (The problem is in the way the makefile is crafted; but some information is missing to give you a complete answer). — jww, Jun 17 '19 at 19:36
@jww I've edited my question to fill in more details. My test program is a bit elaborate because the real program I'm working on is eventually going to need a shared library built up from compiled cuda sources. — user888379, Jun 17 '19 at 20:01
I assembled a complete project based on what you have shown here with no modifications of any kind to the Makefile, with a driver.cpp exactly as you describe and all the other files needed (bridge.cu, add.cu, etc). It compiles and the code runs correctly for me. — Robert Crovella, Jun 17 '19 at 20:11
[Here](https://pastebin.com/Rrm6jVHP) is a complete test case. What you have shown works. There is something important you are not telling us, which is why it's usually good practice on SO to provide a [mcve]. What you have shown so far is not one, and I've provided proof of that. — Robert Crovella, Jun 17 '19 at 20:17
without a [mcve] this question really ought to be closed, and there is a specific failure-to-provide-mcve reason for closure of course, provided by SO. However the duplicate closure might have been a more friendly option, since it at least provides some direction, since the question itself is not useful in its current form. This question should be closed. — Robert Crovella, Jun 17 '19 at 20:19
@RobertCrovella Unfortunately, I'm blocked from your test case. I will edit my question to include the guts of the several source files. — user888379, Jun 17 '19 at 20:29
Here's what you should do. Don't dribble out more bits and pieces. Instead, post what you think is a complete test case. Then, start over with a fresh directory on your machine with nothing in it. Then copy from your posting the files into that directory, and run the Makefile. In other words, exactly what I would do, if I wanted to see the problem. When that process produces the problem, then you will be reasonably confident that you have provided a proper test case. Yes, it requires effort. — Robert Crovella, Jun 17 '19 at 20:31
@RobertCrovella I'm not objecting to the effort involved. I'm a little annoyed with myself that I didn't start by posting the source with the makefile - which would have been less effort. — user888379, Jun 17 '19 at 20:40
Your updated example also works correctly for me. One possible item that remains is a link order issue. Although I'm not able to spot it. You should probably provide the exact tools versions you are using. It would also be helpful if you provided the output of the make command that shows the error. I'm a bit confused by the wording of your question because you say that this error shows up when you run it, but it looks like a linking error to me. Anyway the actual command + output that shows this error would make things clearer. — Robert Crovella, Jun 17 '19 at 20:56
It occurred to me on my way home from work that "." isn't on my PATH. I think that would explain the various symptoms I'm seeing. Thanks again for your forbearance. I'll test this hypothesis when I return to work tomorrow. — user888379, Jun 17 '19 at 21:34

score 1 · Answer 1 · answered Jun 17 '19 at 23:35

When Robert Crovella was able to get my example to work on his machine, while I wasn't able to get his example to work on mine, I started realizing that my problem had nothing to do with CUDA or nvcc. The real issue is that with a shared library, the loader has to find the library and resolve its symbols at run time, and my shared library wasn't in a "well-known location" that the loader searches. I built a simple test case just now, purely with C++ sources, and reproduced the failure. Once I copied libhello.so to /usr/local/lib, I was able to run driver successfully. So, I'm OK with closing my original question, if that's the will of the people.
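Copying the library into a system directory is one fix. Another is to tell the loader where to look for the current session through `LD_LIBRARY_PATH`. A minimal sketch, assuming libhello.so and driver were built in the current directory as in the makefile. The function `prepend_lib_dir` is a hypothetical helper introduced for illustration:

```shell
# Hypothetical helper: put a directory at the front of LD_LIBRARY_PATH
# without leaving an empty entry when the variable was previously unset.
prepend_lib_dir() {
    LD_LIBRARY_PATH="$1${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
    export LD_LIBRARY_PATH
}

# Assumption: libhello.so was built in the current directory.
prepend_lib_dir "$PWD"

# Run the program only if it has already been built.
if [ -x ./driver ]; then
    ./driver
fi
```

This only affects the shell session where it is run. For something more durable without installing the library system-wide, one option is to record a search path in the executable at link time, for example by adding `-Wl,-rpath,'$ORIGIN'` to the driver link command (inside a makefile recipe the dollar sign has to be doubled, `$$ORIGIN`). After copying a library into a directory such as /usr/local/lib, running `ldconfig` may also be needed so the loader's cache picks it up.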
