CUDA: Linking a shared library .so using separate files

Question

I'm trying to build a .so library using nvcc 6.0 from separate .cu files. I managed to compile each file separately using -rdc=true. When I try to link the resulting object files using g++, I get a bunch of errors. I have already built the library successfully from a single file. I read in a question about nvcc 5.0 that this is not supported. I went through the nvcc 6.0 manual but could not find (or understand) whether that's still the case. Below is my makefile (I'm not very experienced at writing makefiles, so any advice is very welcome). The error is pasted afterwards.

NCC = /usr/local/cuda-6.0/bin/nvcc
CC = g++

LCUDA = -L/usr/local/cuda/lib64 -lcuda -lcudart
LNUM = -lm

OOP = -arch=sm_30 -rdc=true --shared -Xcompiler -fPIC -c

all: cuda_ddm.so

cuda_ddm.so : wfpt.o stationary.o
$(CC) -Wall -shared -include ./c_cuda_ddm.h -o $@ $^ $(LCUDA) 

wfpt.o : wfpt.cu
$(NCC) $(OOP) $@ $^ 

test.o : test.cu
$(NCC) $(OOP) $@ $^

The errors:

(Edit: I changed the compiler error to account for the current situation.)

/usr/local/cuda-6.0/bin/nvcc -arch=sm_30 -rdc=true --shared -Xcompiler -fPIC -c wfpt.o wfpt.cu 
/usr/local/cuda-6.0/bin/nvcc -arch=sm_30 -rdc=true --shared -Xcompiler -fPIC -c test.o test.cu 
g++ -Wall -shared -o cuda_ddm.so wfpt.o test.o -L/usr/local/cuda/lib64 -lcuda -lcudart 
test.o: In function `big_random_block(int)':
tmpxft_00003c24_00000000-3_test.cudafe1.cpp:(.text+0x5e): multiple definition of `big_random_block(int)'
wfpt.o:tmpxft_00003bc7_00000000-3_wfpt.cudafe1.cpp:(.text+0x5e): first defined here
test.o: In function `big_random_block_int(int)':
tmpxft_00003c24_00000000-3_test.cudafe1.cpp:(.text+0xde): multiple definition of `big_random_block_int(int)'
wfpt.o:tmpxft_00003bc7_00000000-3_wfpt.cudafe1.cpp:(.text+0xde): first defined here
test.o: In function `value(float, float, int)':
tmpxft_00003c24_00000000-3_test.cudafe1.cpp:(.text+0x169): multiple definition of `value(float, float, int)'
wfpt.o:tmpxft_00003bc7_00000000-3_wfpt.cudafe1.cpp:(.text+0x169): first defined here
test.o: In function `__device_stub__Z14float_to_colorPhPKf(unsigned char*, float const*)':
tmpxft_00003c24_00000000-3_test.cudafe1.cpp:(.text+0x788): multiple definition of `__device_stub__Z14float_to_colorPhPKf(unsigned char*, float const*)'
wfpt.o:tmpxft_00003bc7_00000000-3_wfpt.cudafe1.cpp:(.text+0xb30): first defined here
test.o: In function `float_to_color(unsigned char*, float const*)':
tmpxft_00003c24_00000000-3_test.cudafe1.cpp:(.text+0x7f9): multiple definition of `float_to_color(unsigned char*, float const*)'
wfpt.o:tmpxft_00003bc7_00000000-3_wfpt.cudafe1.cpp:(.text+0xba1): first defined here
test.o: In function `__device_stub__Z14float_to_colorP6uchar4PKf(uchar4*, float const*)':
tmpxft_00003c24_00000000-3_test.cudafe1.cpp:(.text+0x81e): multiple definition of `__device_stub__Z14float_to_colorP6uchar4PKf(uchar4*, float const*)'
wfpt.o:tmpxft_00003bc7_00000000-3_wfpt.cudafe1.cpp:(.text+0xbc6): first defined here
test.o: In function `float_to_color(uchar4*, float const*)':
tmpxft_00003c24_00000000-3_test.cudafe1.cpp:(.text+0x88f): multiple definition of `float_to_color(uchar4*, float const*)'
wfpt.o:tmpxft_00003bc7_00000000-3_wfpt.cudafe1.cpp:(.text+0xc37): first defined here
collect2: error: ld returned 1 exit status
make: *** [cuda_ddm.so] Error 1

Edit:

To clarify the situation, I changed the code to be 100% sure that there is no overlapping code in the two files. Both contain an #include "c_cuda_ddm.hcu", and that header has the following content:

# ifndef DDM_HEADER
# define DDM_HEADER

#include "book.h"
#include "math.h"

# define TOL 1e-7
# define PI 3.1415926535

# define DIM_X 0
# define DIM_Y 2
# define DIM_U 2
# define DIM_THETA 3
# define DIM_PTHETA 0

# define INDEX_V 0
# define INDEX_A 1
# define INDEX_W 2

# define CUE_LEFT 1
# define CUE_RIGHT 0
# define ANTISACCADE_TYPE 0
# define PROSACCADE_TYPE 1

// Number of threads for the predictive posterior
# define DDMBLOCKS 256
# define PPBLOCKS 1024
# define LLHBLOCKS 16 
# endif

__device__ double lp_ddm(double t, double v, double a, double w);

extern "C"
int llh_ddm(double *t, double *v, double *a, double *w, int ny,
    double *llh);


extern "C"
int llh_stationary_antisaccades(double *x, double *y, double *u,
    double *theta, double *ptheta, int ny, double *llh);

extern "C"
int lpp_stationary_antisaccades(double *x, double *y, double *u,
    double *theta, double *ptheta, int ny, int ns, double *llh);

The error is clear - you have compiled the same code twice, once in `wfpt.o` and once in `stationary.o` (presumably from including the same code into both source files). The problem probably has nothing to do with the makefile you have shown. — talonmies, Jul 01 '14 at 12:51
@talonmies I verified if the problem is the one that you mentioned. I'm almost positive that it is not related to that, but look at my edit. — eaponte, Jul 01 '14 at 14:55
It's true that you are missing a device link step in your makefile. When you compile with `-dc`, you cannot simply go to `g++` to link things together. A separate, intermediate device link step is required. However, this is not the source of the problem indicated in this question. @talonmies' response is correct, and your edit changes nothing. Include guards don't prevent including the same code into 2 separate files. They only prevent multiple inclusion into a *single* file. A simplified test case is necessary to sort all this out for you, but you haven't provided it, as SO expects. — Robert Crovella, Jul 01 '14 at 15:10
Sorry @RobertCrovella, I'm slow... I don't get it. I looked at both files and made sure that there is no overlapping code (I can cat one of them into the other and the code compiles -given that I change my makefiles). So I'm not sure why the code is twice in both files. I guess I'm still getting a compilation error, but it might be a different error. I will update the error that I'm getting. — eaponte, Jul 01 '14 at 15:15
A simple change to your makefile, switching from `g++` to `nvcc` to do the final creation step of your `.so`, would probably be sufficient to sort out the device link requirement. However it will not (and cannot) fix the problem of including the same code into two separate files. After you get past the same code problem, you may want to study the examples given [here](http://docs.nvidia.com/cuda/cuda-compiler-driver-nvcc/index.html#examples). It's certainly possible to build a shared object out of components that are separately compiled, i.e. `-dc` or `-rdc=true -c` — Robert Crovella, Jul 01 '14 at 15:21
If you have proper include guards, "cat one of them into the other" might very well fix the problem indicated in this question. It's impossible to say without a complete example. And we don't really need to see *your* files, just stripped down representative examples, that can be compiled and produce a similar problem to what you have shown. — Robert Crovella, Jul 01 '14 at 15:29
Your edited error output simply indicates you are still facing the same problem. Are you including the same file or files into both `wfpt.cu` and `test.cu`? If so, it's probably sufficient just to show whatever file or files you are including into both. — Robert Crovella, Jul 01 '14 at 15:32
Thank you @RobertCrovella. So it seems that the problem is my include statement and/or my header file. If I delete the header, I stop having the problem. I don't need to change the compiler. I guess I'm not getting something very obvious about the include statement. (I put the header file and the include statements in the question). I will put your insights in an answer to the problem. — eaponte, Jul 01 '14 at 15:38
"I put the header file and the include statements in the question." I assumed when you put etc.. after the header file in the question, that there was more of it you weren't showing. Have you shown the *entire* header file or files that you are including into both modules that produces the error? — Robert Crovella, Jul 01 '14 at 15:47
I pasted the whole header file. It's just a few functions. Their definitions are only in one of the files. — eaponte, Jul 01 '14 at 15:52
@eaponte: things would be a lot simpler if you would just read the error messages. However it is happening, you are compiling big_random_block, big_random_block_int, value, and float_to_color twice. Once in wfpt.o and once in test.o . This is undisputed. The linker is telling you *exactly* what the problem is. None of your edits tells us why it is happening. Look through your code and find where those functions are defined. Then analyse how those functions are getting compiled into each object file you are trying to link. Only you can diagnose and fix this problem. Over to you..... — talonmies, Jul 02 '14 at 05:31
@talonmies, the confusion on my side is that I don't define any of those functions in my code. They are not part of it, so I didn't know how to interpret the error. Those error messages are undisputed, where do they come from, I don't know. — eaponte, Jul 03 '14 at 09:00

score 1 · Accepted Answer · answered Jul 03 '14 at 19:17

big_random_block(), float_to_color (a kernel) and probably all your other duplicate definitions are coming from book.h.

This header file differs from (what I consider to be common practice in) other header files in that it does not just include function prototypes, but actual function definitions.

Therefore book.h can only be (successfully/properly/safely) included in a single file (i.e. compilation unit) in your entire project. If you include it in multiple files, you will get the same functions defined in multiple modules, which will lead to problems if you try to link those modules together.

The fix is to only include book.h in one file, or better yet just grab what you need and create your own properly organized header file from it. book.h is a header file that was intended to accompany the CUDA by example book. While I'm sure it works fine for all the projects in that book, it seems clear that you cannot just pick it up and sprinkle it through any project willy-nilly. Some header files may work that way. This one will not.
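To illustrate the difference, here is a minimal sketch of a header organized so that it can be included from several files. The names (my_utils.h, my_utils.cpp, clamp01, halfway) are hypothetical and are not taken from book.h; both "files" are shown in one listing for brevity. The same rules apply to host functions in .cu files.

```cpp
// ---- my_utils.h (hypothetical header) ----
#ifndef MY_UTILS_H
#define MY_UTILS_H

// Declaration only: there is no body here, so any number of source
// files can include this header without creating duplicate symbols.
float clamp01(float x);

// A function body in a header is safe only when marked inline (or
// static); otherwise every including file emits its own definition
// and the linker reports "multiple definition of ...".
inline float halfway(float a, float b) { return a + (b - a) * 0.5f; }

#endif  // MY_UTILS_H

// ---- my_utils.cpp (hypothetical source file) ----
// The one and only definition of clamp01 lives in a single source file.
float clamp01(float x) {
    if (x < 0.0f) return 0.0f;
    if (x > 1.0f) return 1.0f;
    return x;
}
```

Note that the include guard only protects against including the header twice within one source file. It does nothing to stop two different source files from each compiling their own copy of a function body.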

As an aside, I wish to restate the point that separate compilation (and linking) cannot be accomplished with compile-only steps (-rdc=true -c) alone. It also requires a device link step. If your two object files (wfpt.o and stationary.o) don't actually share or require any CUDA symbols or entry points across modules, it may not matter. But if there are shared CUDA entry points between the modules, the device link step is necessary. This is not the crux of your question, however, and if it is ultimately needed, you will surely discover that your compile sequence as depicted in this question is not correct.
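For reference, a build with an explicit device link step might look roughly like the makefile sketch below. This is an untested outline under assumptions: the file names and paths follow the question, dlink.o is a name I made up for the device-link object, and recipe lines must start with a tab.

```makefile
# Sketch only: paths and sources follow the question; dlink.o is an assumed name.
NVCC  = /usr/local/cuda-6.0/bin/nvcc
CXX   = g++
ARCH  = -arch=sm_30
LCUDA = -L/usr/local/cuda/lib64 -lcuda -lcudart

OBJS = wfpt.o stationary.o

all: cuda_ddm.so

# 1. Compile each .cu file to relocatable, position-independent code.
%.o: %.cu
	$(NVCC) $(ARCH) -rdc=true -Xcompiler -fPIC -c -o $@ $<

# 2. Device link: resolve device symbols shared between the objects.
dlink.o: $(OBJS)
	$(NVCC) $(ARCH) -Xcompiler -fPIC -dlink -o $@ $^

# 3. Host link: g++ combines the objects with the device-link object.
cuda_ddm.so: $(OBJS) dlink.o
	$(CXX) -shared -o $@ $^ $(LCUDA)
```

Alternatively, steps 2 and 3 can usually be collapsed by letting nvcc perform the final link, since it runs the device link implicitly before invoking the host linker. Neither variant fixes duplicate host definitions coming from a header.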

I started learning CUDA last week trying to extend the examples in the and completely disregard the header file... I didn't manage to think about that header as the source of the problem. @RobertCrovella, in order to do the device link step, should I simply link with nvcc? — eaponte, Jul 04 '14 at 09:06
If a device link step is required, then that means `nvcc` is required to do it (`g++` knows nothing about device code or device code linking). Without covering all the nuances of the command, if in general, you switch your final assembly step from `g++` to `nvcc`, then any device linking required can be accomplished in that step, as well as the creation of the `.so` The [nvcc manual](http://docs.nvidia.com/cuda/cuda-compiler-driver-nvcc/index.html#using-separate-compilation-in-cuda) covers these things. — Robert Crovella, Jul 04 '14 at 10:20
