
I am trying to compile CUDA object files with nvcc and then build the final main program with g++. I have seen this post but wasn't able to get my example working. The error I am getting appears to be a link error:

nvcc -c -g -I -dlink -I/usr/local/cuda-11/include -I. -L/usr/local/cuda-11/lib64 -lcudart -lcurand module.cu -o module.o 
g++ -I/usr/local/cuda-11/include -I. -L/usr/local/cuda-11/lib64 -lcudart -lcurand module.o main.cpp -o main
module.o: In function `call_kernel()':
/home/ubuntu/Desktop/CUDA/MPI&CUDA/module.cu:16: undefined reference to `__cudaPushCallConfiguration'
module.o: In function `__cudaUnregisterBinaryUtil()':
/usr/local/cuda-11/include/crt/host_runtime.h:259: undefined reference to `__cudaUnregisterFatBinary'
module.o: In function `__nv_init_managed_rt_with_module(void**)':
/usr/local/cuda-11/include/crt/host_runtime.h:264: undefined reference to `__cudaInitModule'

What am I doing wrong? I am aware that I could simply compile main.cpp with nvcc, but I don't want to do that: I will later replace g++ with mpicxx and put MPI code in main.cpp.

My makefile is:

INC := -I$(CUDA_HOME)/include -I.
LIB := -L$(CUDA_HOME)/lib64 -lcudart -lcurand
CUDAFLAGS=-c -g -I -dlink $(INC) $(LIB)

all: main

main: module.o
    g++ $(INC) $(LIB) module.o main.cpp -o main

module.o: module.cu module.h 
    nvcc -c -g -I -dlink $(INC) $(LIB) module.cu -o module.o 

clean: 
    rm -rf *.o

main.cpp

#include "module.h"

int main(){

    return 0;
}

module.cu

#ifdef __CUDACC__
#define CUDA_GLOBAL __global__
#else
#define CUDA_GLOBAL
#endif

#include <cuda.h>
#include "module.h"

CUDA_GLOBAL
void kernel(){

}

void call_kernel(){
    kernel<<<1,1>>>();
}

module.h

#ifndef _MODULE_H_
#define _MODULE_H_

#ifdef __CUDACC__
#define CUDA_GLOBAL __global__
#else
#define CUDA_GLOBAL
#endif

#include <numeric>
#include <cuda.h>

CUDA_GLOBAL
void kernel();

void call_kernel();

#endif
Joachim
  • I don't get any errors running your test case on my machine. These errors sometimes come about due to a mixed machine configuration (multiple CUDA versions, with mixed usage). With `nvcc` it should not be necessary to specify things like: `-I/usr/local/cuda-11/include` and `-L/usr/local/cuda-11/lib64`. What is the output of `nvcc --version` on your machine? – Robert Crovella Dec 27 '21 at 15:50
  • @RobertCrovella the output of `nvcc --version` is `nvcc: NVIDIA (R) Cuda compiler driver Copyright (c) 2005-2020 NVIDIA Corporation Built on Mon_Nov_30_19:08:53_PST_2020 Cuda compilation tools, release 11.2, V11.2.67 Build cuda_11.2.r11.2/compiler.29373293_0` – Joachim Dec 27 '21 at 17:04
  • @RobertCrovella A mix of CUDA versions is possible, though. I had previously installed CUDA 10 incorrectly, and I can see two folders under `/usr/local`, cuda-10.1 and cuda-10.2, which don't seem to be active. I also checked the output of `nvidia-smi`, and the CUDA version in use is also 11.2 – Joachim Dec 27 '21 at 17:10
  • What is the result of running `which nvcc`? – Robert Crovella Dec 27 '21 at 17:16
  • @RobertCrovella `/usr/local/cuda-11/bin/nvcc` – Joachim Dec 27 '21 at 17:18

2 Answers


Your link line is wrong. All libraries (e.g., -lfoo) must come at the end of the link line after all the object files (e.g., .o files).

Beyond that, the libraries themselves need to be ordered properly (I don't know the right order for these two, so the order above may already be correct).

Almost all modern linkers are "single pass" linkers: they go through the libraries only once, and they pull in only the symbols that are still unresolved at that point. You must therefore list the libraries with the "highest level" content first and the "lower level" content they depend on after them.
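
As a rough sketch of what that means for the commands in the question, the two build steps could be written as below. The file names and library flags are taken from the question; the `CUDA_HOME` default of `/usr/local/cuda-11` is an assumption based on the paths shown there, and the stray `-I -dlink` from the original compile line is left out on the assumption that it is not needed for this single-file case. The script only prints the commands unless `nvcc` and `module.cu` are actually present.

```shell
#!/bin/sh
# Assumption: CUDA_HOME points at the toolkit; the default mirrors the question's paths.
CUDA_HOME="${CUDA_HOME:-/usr/local/cuda-11}"

INC="-I$CUDA_HOME/include -I."
LIB="-L$CUDA_HOME/lib64 -lcudart -lcurand"

# Step 1: nvcc only compiles the CUDA source to an object file (no linking here,
# so no -L/-l flags are needed on this line).
COMPILE="nvcc -c -g module.cu -o module.o"

# Step 2: the host compiler links. Objects and sources come first,
# the libraries come last so the linker already knows which symbols it needs.
LINK="g++ $INC module.o main.cpp -o main $LIB"

echo "$COMPILE"
echo "$LINK"

# Only run the commands when the toolchain and the source file are available.
if command -v nvcc >/dev/null 2>&1 && [ -f module.cu ]; then
    $COMPILE && $LINK
fi
```

Swapping `g++` for `mpicxx` in the link step should leave the ordering rule unchanged, since the MPI wrapper passes the arguments on to the underlying compiler and linker.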

MadScientist
  • Thank you so much! It worked. Putting the libraries at the end does the trick. I assume the "highest level" order is correct here, as this part comes from a course I took. – Joachim Dec 27 '21 at 17:47

Thanks to MadScientist's explanation: the libraries must come at the end of the link line, after all the .o files:

INC := -I$(CUDA_HOME)/include -I.
LIB := -L$(CUDA_HOME)/lib64 -lcudart -lcurand

all: main

main: module.o
    mpicxx module.o main.cpp -o main $(INC) $(LIB)

module.o: module.cu module.h 
    nvcc -c -g module.cu -o module.o -I -dlink

clean: 
    rm -rf *.o
Joachim
  • You should also put `main.cpp` before the libraries. Maybe it works in this case but it won't always work: it won't work the way you have it here if needed symbols are referenced in `main.cpp` rather than `module.o`. The libraries should always be _last_ on the link line. – MadScientist Dec 27 '21 at 17:53
  • Also I have no idea why you're setting `CUDAFLAGS` then never using that variable. – MadScientist Dec 27 '21 at 17:53
  • Alright, thanks for this additional explanation. I have edited my answer. `CUDAFLAGS` is a variable of my makefile I later use with additional modules. I have removed it. – Joachim Dec 27 '21 at 17:58
  • @Joachim: please stop using the term “modules” for what is being discussed in this question. Modules are a [specific feature](https://en.cppreference.com/w/cpp/language/modules) of C++20, and a separate, [specific feature](https://docs.nvidia.com/cuda/cuda-driver-api/group__CUDA__MODULE.html#group__CUDA__MODULE) of the CUDA driver API. What you are talking about is neither of those things, so these are not “modules”. Using the term will only confuse people, including search engines that will lead future visitors to this question – talonmies Dec 29 '21 at 23:34
  • @talonmies I might have been influenced in a bad way coming from Python. I have updated part of my question, but I think I will leave the scripts as they are, if it's not too bad – Joachim Dec 30 '21 at 07:45