
Suppose I have a function template in a separate compilation unit (which the CUDA C compiler NVCC compiles into an object file with suffix .o).

Let's say we have the definition (implementation):

template<typename T> 
void foo(T a){
   // something
}

To get actual code into the object file, so that another compilation unit can link against it, I need to explicitly instantiate this template (for every template argument I need) like this:

template void foo<double>(double a);
template void foo<float>(float a);

Doing this results in actual code in the object file.

On the other hand, writing:

template<> void foo<double>(double a);
template<> void foo<float>(float a);

does not produce code in the object file, because this is only the declaration of a full template specialization. Is this correct?

Also

void foo(double a);
void foo(float a);

does not produce code either, because these would be overload declarations. Is this correct?

My question now is: what is the general syntax to make the compiler produce code for a function template or class template in a separate compilation unit?

Gabriel
  • Your full specialization declaration is just a *declaration*. There is no definition there, hence there is no code. It is not an instantiation. (This is unrelated to CUDA, by the way.) – user703016 May 27 '14 at 14:20
  • You are correct that it doesn't "produce code". To "produce code" when compiling a separate compilation unit, the compiler needs access to the template declaration and definition at compile time (not link time); hence all the discussion about templates in headers or explicit instantiations. So to have the compiler produce code for a function or class template **when compiling** another compilation unit, that unit needs access to the declaration/definition and must then **use it** or **explicitly instantiate it**. Always respecting the ODR. – Marco A. May 27 '14 at 14:52

1 Answer


In layman's terms, when you write this:

template void foo<double>(double a);
template void foo<float>(float a);

You are explicitly telling the compiler to instantiate the function template with the given template arguments, so you get an implementation of foo<double> and foo<float> just as if you had copy-pasted the code from the function template and replaced T with double and float. The template definition must be visible at the point of the explicit instantiation.
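To make this concrete, here is a minimal single-file sketch. The file boundaries (foo.h, foo.cpp, main.cpp) are hypothetical and only marked by comments, and foo returns T with a placeholder body (the question's foo returns void) purely so that the result can be checked. It is an illustration under those assumptions, not the asker's actual code.

```cpp
#include <cassert>

// --- foo.h (hypothetical header): declaration only, no body ---
template <typename T>
T foo(T a);

// --- foo.cpp (hypothetical source file): definition of the template ---
template <typename T>
T foo(T a) {
    return a + a;  // placeholder body standing in for "something"
}

// Explicit instantiation definitions: these make the compiler emit code
// for foo<double> and foo<float> in this translation unit.
template double foo<double>(double a);
template float foo<float>(float a);

// --- main.cpp (hypothetical caller): would only include foo.h ---
void call_from_other_unit() {
    assert(foo(2.0) == 4.0);    // resolves to foo<double>
    assert(foo(1.5f) == 3.0f);  // resolves to foo<float>
}
```

If the three parts really lived in separate files, a call such as foo(1) (that is, foo<int>) from main.cpp would compile but fail at link time with an undefined reference, because no instantiation for int exists in any object file.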

On the other hand, when you write this:

template<> void foo<double>(double a);
template<> void foo<float>(float a);

You are telling the compiler that foo<double> and foo<float> are special cases that get their own hand-written implementations instead of being generated from the generic foo<T> definition. This is called (explicit) specialization. However, you are not providing a definition for these specializations, only declarations: you are merely telling the compiler that these things exist, but not what they are. A definition of a specialization would look like this:

template<>
void foo<double>(double a) {
    // something else
}
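As a rough sketch of how the two mechanisms differ in effect, the example below gives the primary template and the double specialization different placeholder bodies. The int return values (0 and 1) are an assumption made only for this illustration, so that one can see which body was chosen.

```cpp
#include <cassert>

// Primary template: the generic implementation.
template <typename T>
int foo(T /*a*/) {
    return 0;  // placeholder: "generic" behaviour
}

// Explicit specialization WITH a definition: foo<double> uses this body
// instead of the generic one.
template <>
int foo<double>(double /*a*/) {
    return 1;  // placeholder: "something else"
}

// Explicit instantiation: foo<float> is generated from the primary template.
template int foo<float>(float a);

// Hypothetical helper showing which body each call ends up in.
void check_dispatch() {
    assert(foo(1.0f) == 0);  // float  -> generic body
    assert(foo(1.0) == 1);   // double -> specialized body
}
```

A specialization that is only declared (as in the question) would still compile here, but any call to it would end in an undefined reference at link time unless some translation unit supplies the definition.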

Depending on your intent, you may want to either:

  • Use explicit instantiation (if foo<double> and foo<float> share the same implementation)
  • Use different specializations, while providing actual definitions for these specializations.

I am guessing you want the first.
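Since the question also asks about classes: the same explicit instantiation syntax works for class templates. The sketch below uses a hypothetical class template Bar (not from the question) to show the form.

```cpp
#include <cassert>

// Hypothetical class template used only for illustration.
template <typename T>
class Bar {
public:
    explicit Bar(T v) : value_(v) {}
    T doubled() const { return value_ + value_; }

private:
    T value_;
};

// Explicit instantiation definition for a class template: instantiates the
// members of Bar<double> in this translation unit.
template class Bar<double>;

// Hypothetical helper exercising the instantiated class.
void check_bar() {
    Bar<double> b(2.5);
    assert(b.doubled() == 5.0);
}
```

As with functions, the member definitions have to be visible at the point where the explicit instantiation is written.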

Griwes
user703016