
Using CUDA 5.0 on Ubuntu with gcc/g++ 4.6, I'm getting linker errors when linking CUDA code that uses templates.

cu_array.cu:

#include "cu_array.hpp"
template<class T>
CuArray<T>::CuArray(unsigned int n) {
  cudaMalloc(&data,n*sizeof(T));
}

cu_array.hpp:

#pragma once
template<class T>
class CuArray {
public:
  CuArray(unsigned int n);
private:
  T* data;
};

main.cu:

#include "cu_array.hpp"
int main() {
  CuArray<float> a(10);
}

Both files compile fine with nvcc -c, but linking with nvcc cu_array.o main.o fails with undefined reference to CuArray<float>::CuArray(unsigned int). If I move the contents of cu_array.cu into the header and build only main.cu, the template works fine. If I remove the templates altogether, the code also links fine.

I'm sure there's a simple answer for this. Any ideas?

Aurelius
    Possible duplicate of [this](http://stackoverflow.com/questions/1724036/splitting-templated-c-classes-into-hpp-cpp-files-is-it-possible). Probably the easiest solution is given in the second answer there, which is to add something like `template CuArray<float>::CuArray(unsigned int);` at the end of cu_array.cu. This is not specific to CUDA. – Robert Crovella Jun 03 '13 at 18:49

1 Answer


You haven't instantiated the class template in the compilation unit where its member functions are defined, so the compiler doesn't emit any code for the constructor, and linkage fails. This isn't specific to CUDA. It follows from the template instantiation model g++ uses, in which a template is instantiated only in a compilation unit that both uses it and can see its definition, and lots of people get caught out by it.

As you have found already, the simplest solution is to include everything into the same compilation unit, and the problem disappears.

Otherwise, if you explicitly instantiate CuArray<float>::CuArray at the bottom of cu_array.cu like this:

template CuArray<float>::CuArray(unsigned int);

the compiler will emit code where it would otherwise not, and the linkage problem will be fixed. To make this approach work, you will need to explicitly instantiate every member function for every type you want to use elsewhere in the code.
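If listing each member function becomes tedious, standard C++ also allows an explicit instantiation of the whole class, which emits code for all of its members at once. The following is only a sketch under some assumptions: it is plain host C++ so that it builds without the CUDA toolkit, std::malloc and std::free stand in for cudaMalloc and cudaFree, and the destructor, the size() member, and the count field are hypothetical additions that are not in the original class.

```cpp
#include <cstdlib>

// --- cu_array.hpp (sketch): declarations only, as in the question ---
template<class T>
class CuArray {
public:
  CuArray(unsigned int n);
  ~CuArray();                  // hypothetical: not in the original class
  unsigned int size() const;   // hypothetical: not in the original class
private:
  T* data;
  unsigned int count;          // hypothetical: backs size()
};

// --- cu_array.cu (sketch): out-of-line member definitions ---
template<class T>
CuArray<T>::CuArray(unsigned int n)
  : data(static_cast<T*>(std::malloc(n * sizeof(T)))),  // stand-in for cudaMalloc
    count(n) {}

template<class T>
CuArray<T>::~CuArray() {
  std::free(data);  // stand-in for cudaFree
}

template<class T>
unsigned int CuArray<T>::size() const {
  return count;
}

// Explicit instantiation of the whole class: the compiler emits code for
// every member of CuArray<float> and CuArray<int> in this compilation unit,
// so other object files can link against them.
template class CuArray<float>;
template class CuArray<int>;
```

With this layout, main.cu only needs the header, but any type not listed at the bottom of cu_array.cu (for example CuArray<double>) would still fail to link until a matching line is added.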

talonmies