How are templated C++ functions in headers sent to the linker?

Question

Suppose we have this header, with a function:

#include <iostream>

//template<int does_nothing=0> // <-- !! links correctly if uncommented
void incAndShow() {
  static int myStaticVar = 0;
  std::cout << ++myStaticVar << " " << std::endl;
}

If this header is included in several .cpp files, linking fails because incAndShow() is defined more than once.

Now, if you uncomment the //template<int does_nothing=0> line above, the code compiles and links with no error. The static variable is also shared correctly by every call to incAndShow<0>(), whichever translation unit the call comes from.
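To make the sharing concrete, here is a minimal sketch. It assumes a hypothetical variant, incAndCount(), that returns the counter instead of printing it (that name and usageExample() are not part of the original header). It only shows the behavior inside one translation unit; the cross-file case needs a real multi-file build.

```cpp
#include <cassert>

// Hypothetical variant of incAndShow() that returns the counter instead of
// printing it, so the result can be checked.
template <int does_nothing = 0>
int incAndCount() {
  static int myStaticVar = 0;  // one object per distinct instantiation
  return ++myStaticVar;
}

// Usage sketch: calls with the same template argument share one counter;
// a different argument names a different function with its own counter.
inline void usageExample() {
  assert(incAndCount<>() == 1);   // incAndCount<0>, via the default argument
  assert(incAndCount<0>() == 2);  // same instantiation, same static
  assert(incAndCount<1>() == 1);  // separate instantiation, separate static
}
```

So "shared" here means shared per instantiation: incAndCount<0> and incAndCount<1> are different functions as far as the linker is concerned.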

This works because the C++ standard explicitly allows it in its "One Definition Rule" (Page 43). The abbreviated text is:

There can be more than one definition of a... non-static function template (17.5.6)... provided that each definition appears in a different translation unit...

Now, note that the above applies to templated functions, and not to ordinary (non-inline) non-templated functions. At least in the traditional paradigm, a non-templated function is compiled separately in each translation unit. The linker, envisioned as a totally separate process from the compiler, then sees multiple definitions of the same symbol in different .obj files and reports an error.

But somehow, this doesn't happen for templated functions.

So my question is: how, in a low-level sense, does this work?

1. Does the compiler just compile each translation unit independently, meaning it generates a copy of the same symbol in each translation unit, but then have some special way to tell the linker not to worry about it?
2. Does the compiler, on the fly, keep a note of the first time some templated function definition is seen in some translation unit, and automatically ignore the same definition when it appears again while compiling the next translation unit?
3. Does the compiler do some kind of "pre-compilation process," before compiling anything, where it goes through each file and removes duplicate templated code?
4. Something else?
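As a rough illustration of what the first option could look like: on GCC and Clang, template instantiations are typically emitted as weak symbols, which the linker is allowed to see more than once. The sketch below mimics that by hand on a non-template function using the `__attribute__((weak))` extension. This is an analogy under stated assumptions, not what the compiler literally emits for templates; incWeak() is a hypothetical name, and the attribute is a GCC/Clang extension rather than standard C++.

```cpp
#include <cassert>

// GCC/Clang extension, not standard C++. A weak definition may appear in
// several object files; the linker keeps one copy and discards the rest
// instead of reporting a multiple-definition error.
__attribute__((weak)) int incWeak() {
  static int counter = 0;
  return ++counter;
}

// Usage sketch: every caller ends up in whichever copy the linker kept.
inline void usageExample() {
  assert(incWeak() == 1);
  assert(incWeak() == 2);
}
```

If this definition were placed in a header and included from several .cpp files, the build would be expected to link, unlike the plain non-template version. As with templates, the linker does not check that the copies are actually identical.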

This has been brought up on StackOverflow a few times, but this is the first time I've seen anyone ask how the compilation process really works at a low level:

Static variable inside template function (where my example came from)
Why C++'s <vector> templated class doesn't break one definition rule?

Templates is a pure compile-time thing, and the linker is not involved when using templates. What happens without the `template`, is that the function simply is *defined* in each [translation unit](https://en.wikipedia.org/wiki/Translation_unit_(programming)) that includes the header file (breaking the [one definition rule](https://en.cppreference.com/w/cpp/language/definition)). When you add `template` you no longer define a function, but a template of a function, which can then be instantiated in a translation unit. — Some programmer dude, Dec 05 '20 at 03:33
[This `template` reference](https://en.cppreference.com/w/cpp/language/templates) might be helpful to read. As well as [Why can templates only be implemented in the header file?](https://stackoverflow.com/questions/495021/why-can-templates-only-be-implemented-in-the-header-file) And of course refresh the sections about templates in your text books. — Some programmer dude, Dec 05 '20 at 03:37
You get the same thing with non-template functions that are implicitly or explicitly inline. The compiler may inline the code, may generate a function that the linker sees, or both. When the linker puts it all together, if it needs a function somewhere that is defined in multiple translation units, it will pick one of those definitions. — 1201ProgramAlarm, Dec 05 '20 at 03:37
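A minimal sketch of the inline case described in the comment above, assuming hypothetical names incInline() and Counter::inc(). Both definitions could sit in a header included from several .cpp files without a multiple-definition error, and in each case the function-local static is a single object for the whole program.

```cpp
#include <cassert>

// Explicitly inline: may be defined in every translation unit that
// includes the header, as long as all definitions are identical.
inline int incInline() {
  static int counter = 0;
  return ++counter;
}

struct Counter {
  // Defined inside the class body, so implicitly inline.
  static int inc() {
    static int counter = 0;
    return ++counter;
  }
};

// Usage sketch: each function keeps its own counter.
inline void usageExample() {
  assert(incInline() == 1);
  assert(incInline() == 2);
  assert(Counter::inc() == 1);
}
```

This is the same One Definition Rule exemption that templates get, which is why adding `inline` to the original incAndShow() is the usual fix for a header-defined function.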
@Someprogrammerdude, thanks for the reference. It *does* seem like it's all about the linker, at least from what cppreference says: "At link time, identical instantiations generated by different translation units are merged." So the linker would normally throw a redefinition error if this happens, but if the function is templated, knows to instead just merge the definitions. So it looks like my first suggestion is the way they do it. — Mike Battaglia, Dec 05 '20 at 03:55
Strangely, at least on clang, it will even merge *non-identical* instantiations. That is, if you have `template void foo(){std::cout << 1}` in one translation unit, and `template void foo(){std::cout << 2}` in another, it seems to just pick one (randomly?) and use that for all without throwing any errors. — Mike Battaglia, Dec 05 '20 at 03:57
@MikeBattaglia: That breaks one definition rule (ODR), the program is ill formed, no diagnostic required. — Jarod42, Dec 05 '20 at 08:39
@MikeBattaglia -- "merged" in that quote is misleading (although that's how it's often described). The linker picks one version of the function and uses that for all; nothing gets combined, which is why "merged" is misleading. And it doesn't matter whether they are identical; if they're not, you've got an ODR violation, and the linker might tell you that there's a problem, or it might not. If it doesn't, you've got a debugging nightmare, because you'll be looking at one definition of the function while the code is calling a different one. Yes, I've been there... — Pete Becker, Dec 05 '20 at 13:57
