Code organization across files that has to deal with template functions and inlining

Question

I'm maintaining a large library of template classes that perform algebraic computations on either float or double. Many of the classes have accessor methods (getters and setters) and other functions that run only a small amount of code, so such functions should be treated as inline wherever the compiler can see their definitions. Other member functions, in contrast, contain sophisticated code and would be better called than inlined.

A substantial part of the function definitions are located in headers, or more precisely in .inl files included by headers. But there are also many classes whose function definitions happily live in .cpp files by means of explicit instantiation for float and double, which is rather a good thing to do in the case of a library (here explained why). And finally, there is a considerable number of classes whose function definitions are split across .inl files (accessor methods) and .cpp files (constructors, destructors, and heavy computations), which makes them all pretty difficult to maintain.

I would keep all my class implementations in .inl files if only I knew a reliable way to prevent some functions from being inlined, or in .cpp files if the inline keyword could strongly suggest to the compiler that it inline some of the functions, which, of course, it does not. I would really prefer all the function definitions in the library to reside in .cpp files, but since accessor methods are used extensively throughout the library, I have to make sure they are inlined wherever they are referenced, not called.

So, in this connection, my questions are:

1. Does it make any sense to mark the definition of a template function with inline in view of the fact that, as I've recently learnt here, it is going to be automatically qualified as inline by the compiler regardless of whether it's marked with inline or not?
2. And most importantly: I would like to have the definitions of all the member functions of a template class gathered together in a single file, whether it's .inl or .cpp (using explicit instantiation in the case of .cpp), while still being able to hint to the compiler (MSVC and GCC) which of the functions should be inlined and which shouldn't, provided such a thing is possible with template functions. How can I achieve this or, if there is really no way (I hope there is), what would be the best compromise?

----------

EDIT1: I know that the inline keyword is just a suggestion to the compiler to inline a function.

EDIT2: I really do know. I like making suggestions to the compiler.

EDIT3: I still know. It's not what the question is about.

----------

In view of some new information, there is also a third question that goes hand in hand with the second one.

3. If compilers are so smart these days that they can make better choices about which function should be inlined and which should be called, and are capable of link-time code generation and link-time optimization, which effectively allows them to look into a .cpp-located function definition at link time to decide its fate about being inlined or called, then maybe a good solution would be simply moving all the definitions into the respective .cpp files?

----------

So what's the conclusion?

First of all, I'm grateful to Daniel Trebbien and Jonathan Wakely for their structured and well-founded answers. Upvoted both but had to choose just one. None of the given answers, however, presented an acceptable solution to me, so the chosen answer happened to be the one that helped me slightly more than others in making the final decision, the details of which are explained next for anyone who's interested.

Well, since I've always valued the performance of code more than how convenient it is to maintain and develop, it appears to me that the most acceptable compromise would be to move all the accessor methods and other lightweight member functions of each of the template classes into the .inl file included by the respective header, marking these functions with the inline keyword in an attempt to provide the compiler with a good hint (or with a keyword for forced inlining), and move the rest of the functions into the respective .cpp file.

Having all member function definitions located in .cpp files would hinder inlining of lightweight functions while unleashing some problems with link-time optimization, as has been ascertained by Daniel Trebbien for MSVC (in an older stage of development) and by Jonathan Wakely for GCC (in its current stage of development). And having all function definitions located in headers (or .inl files) doesn't outweigh the summary benefit of having the implementation of each class sorted into .inl and .cpp files combined with a bonus side effect of this decision: it would ensure that only the code of primitive accessor methods is visible to a client of the library, while more juicy stuff is hidden in the binaries (ensuring this wasn't a major reason, however, but this plus was obvious for anyone who is familiar with software libraries). And any lightweight member function that doesn't need to be exposed by the include files of the library and is used privately by its class can have its definition in the .cpp file of the class, while its declaration/definition is spiced with inline to encourage the inline status of the function (don't know yet whether the keyword should be in both places or just one in this particular case).

Simply searching for "inline" and "template" seems to turn up a lot of discussions already, where the issues are hammered out. For instance: http://stackoverflow.com/questions/10535667/does-it-make-any-sense-to-use-inline-keyword-with-templates — HostileFork says dont trust SE, Jul 10 '12 at 19:58
@HostileFork Thanks but, it's one of those rare SO pages where the top answers are contradicting.. Hopefully, the topic will have another chance to get some consensus here. — Desmond Hume, Jul 12 '12 at 17:04
..contradicting without giving any valuable info or sharing experience, really. — Desmond Hume, Jul 12 '12 at 17:54
I think I'm confused. Is the question "Does it make sense to straitjacket my compiler's inline strategy?" This is a question about optimization to which you're requesting "a credible and/or official" response. I think the only possible "credible and/or official" response to such a question would have to be "it depends." — Managu, Jul 13 '12 at 02:56
Or is the question "How do I specify a straitjacket for my compiler's inline strategy?" As has been pointed out, this can't be done in standard C++. — Managu, Jul 13 '12 at 03:00
1. I prefer to not use .inl files, instead write the 'small' functions in the headers and the implementation in the .cpps, then use explicit instantiation. I think that the additional burden of the definitions spread across multiple files is worth the, sometimes dramatic, compilation and linkage speed-up. 2. Your best bet is to trust the compiler, in which case there are two options: writing everything in .h, or write everything in .cpp and use global program optimization. — Yakov Galka, Jul 13 '12 at 11:58
3. `inline` is a suggestion according to the standard, but in practice some compilers (I know about MSVC for sure) *ignore it completely* for the purpose of optimization. — Yakov Galka, Jul 13 '12 at 12:01
@ybungalobill: link speed-up is definitely something you don't get by turning to whole-program-optimization. — Ben Voigt, Jul 13 '12 at 21:56
@BenVoigt: I haven't said that. I use what I described in no. 1, but seemingly the OP is more interested in "writing everything in one place", so no. 2 is his case. — Yakov Galka, Jul 14 '12 at 08:49

Ben Voigt · Answer 1 · 2012-07-14T14:50:56.983

8

In short: Put the template code in a header file. Use compiler-specific forceinline or noinline keywords if the optimizer fails to make good decisions about inlining.

You can and should put definitions of template members into header files. This ensures that the compiler has access to the definition at the point of use when it finds out what the actual template parameters are, and is able to perform implicit instantiation.

The inline keyword has very little impact on templates, since template functions are already exempted from the single definition requirement (the One Definition Rule still requires that all definitions be the same). It is a hint to the compiler that the function should be inlined, and you can omit it as a hint to the compiler not to inline the function. So use it that way. But the optimizer will still look at other factors (such as function size) and make its own choice on inlining.

Some compilers have special keywords, like __attribute__((always_inline)) or __declspec(noinline), to override the optimizer's choice.

Mostly, though, the compiler is smart enough not to inline "complex code that makes more sense as a function call". You shouldn't have to worry about it, just let the optimizer do its thing.

Portable inlining control isn't beneficial, because the trade-offs of inlining are very platform-specific. The optimizers should already be aware of those platform-specific tradeoffs, and if you do feel the need to override the compiler's choice, do so on a per-platform basis.

edited Jul 14 '12 at 14:50

answered Jul 10 '12 at 19:51

Ben Voigt

N.B. the GCC attribute is spelled `always_inline` – Jonathan Wakely Jul 13 '12 at 07:47
@BenVoigt This answer would be useful without imperative instructions like "Put the template code in a header file" and "You can and should put definitions into header files" but with argumentation on why I should choose headers over .cpp files. And no mentioning of explicit instantiation.. If you are bringing up compiler-specific `forceinline`, then I could use such a keyword for any function in a .cpp file, which would possibly have an effect with link-time code generation feature supported by some modern compilers and which is not covered in your answer. – Desmond Hume Jul 13 '12 at 12:39
@Desmond: The imperative advice is the answer to your question #2. Also, there are valid reasons for using explicit instantiation instead of implicit, but inlining control is not one of them. Explicit instantiation is totally off-topic for this Q&A. – Ben Voigt Jul 13 '12 at 13:12
In case if someone got great answers and is thinking whether it's worth posting them as I will accept this one anyways, please know that I'm not going to. – Desmond Hume Jul 13 '12 at 18:16
@DesmondHume: Clearly you aren't interested in an answer based in reality. You've already decided what you want the "correct" answer to be, and you're looking for some expert to agree with your rationale. I won't do that, and I couldn't even if I wanted to, because I don't know what your rationale was that you're looking for confirmation of. – Ben Voigt Jul 13 '12 at 21:54
@BenVoigt I was interested in an answer not only based in reality but also addressing at least one of my specific questions, not in some "TL;DR" answer that at the very beginning refers to the text of my question as "Too Long; Didn't Read". – Desmond Hume Jul 13 '12 at 22:36
@DesmondHume: Well, I would feel bad about the fact you were insulted due to misunderstanding what "TL;DR answer" means... except that you've been belligerent since long before I edited that in. Every one of your numbered questions is addressed in my answer, it just didn't make sense to explain things in the order you asked. – Ben Voigt Jul 14 '12 at 00:12
#1 -- You asked: "Does it make any sense to mark the definition of a template function with inline?" I told you (second paragraph after the break) that you can use the inline keyword that way, but most compilers look at other factors instead. – Ben Voigt Jul 14 '12 at 00:14
#2 -- You asked "I would like to have the definitions of all the member functions of a template class gathered together in a single file"..."which of the functions should be inlined and which shouldn't"..."how can I achieve this?" I told you (first paragraph) that it's a best practice to put all the members together in a header. And I told you (third paragraph) the keywords that do control inlining (since `inline` really deals with One-Definition-Rule). – Ben Voigt Jul 14 '12 at 00:17
#3 -- I have no idea what "decide its faith about inlining" means. That's not how you use the word "faith". But I did tell you that the compiler has platform-specific knowledge, so in most cases you shouldn't try to override it. And I told you the correct way to override it. Which means that "no, trying to control inlining by moving things out of header files is NOT a good solution." Which was just a dumb question really, after your prior sentence showed you know that the compiler can still inline across compilation units. Apply some critical thinking here. – Ben Voigt Jul 14 '12 at 00:20
@BenVoigt I'm just not going to continue this, though I could give a good reply to each of the three points. But sure thank you for pointing at that accidental word misuse, it was really helpful, unlike your imperative TL;DR answers. And sorry for not sorting my questions in the order you would like them to be. And for asking a "dumb question" too, I really should have had all the answers in advance to avoid asking "dumb" questions, didn't have all the answers or all the confirmed info in advance, sorry. – Desmond Hume Jul 14 '12 at 12:28
@Desmond: Most of your question is quite reasonable. I'm just looking at what your #3 says, which is (to paraphrase): "I know that the compiler can perform cross-module inlining. Can I prevent inlining by moving functions to a .cpp file?" Perhaps that could use some additional clarification, because the way it reads now, you answered your own question. Finally, "TL;DR answer" doesn't mean I didn't read the question, it means I'm providing a two-sentence summary to some future reader who doesn't want to read my entire answer (mostly to people thinking about posting new answers for bounty). – Ben Voigt Jul 14 '12 at 14:47

score 5 · Accepted Answer · edited Jun 20 '20 at 09:12

1. Does it make any sense to mark the definition of a template function with inline in view of the fact that, as I've recently learnt, it is going to be automatically qualified as inline by the compiler regardless of whether it's marked with inline or not? Is the behavior compiler-specific?

I think you are referring to the fact that a member function defined in its class definition is always an inline function. This is per the C++ Standard, and has been since the first publication:

9.3 Member functions

...

A member function may be defined (8.4) in its class definition, in which case it is an inline member function (7.1.2)

So, in the following example, template <typename FloatT> my_class<FloatT>::my_function() is always an inline function:

template <typename FloatT>
class my_class
{
public:
    void my_function() // `inline` member function
    {
        //...
    }
};

template <>
class my_class<double> // specialization for doubles
{
public:
    void my_function() // `inline` member function
    {
        //...
    }
};

However, by moving the definition of my_function() outside of the definition of template <typename FloatT> my_class<FloatT>, it is not automatically an inline function:

template <typename FloatT>
class my_class
{
public:
    void my_function();
};

template <typename FloatT>
void my_class<FloatT>::my_function() // non-`inline` member function
{
    //...
}

template <>
void my_class<double>::my_function() // non-`inline` member function
{
    //...
}

In the latter example, it does make sense (as in, it's not redundant) to use the inline specifier with the definitions:

template <typename FloatT>
inline void my_class<FloatT>::my_function() // `inline` member function
{
    //...
}

template <>
inline void my_class<double>::my_function() // `inline` member function
{
    //...
}
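
As a small sketch of why the specifier matters most for the specialization (the class name numeric_id and its member value() are hypothetical, chosen only for illustration): an explicit specialization of a member function is an ordinary function rather than a template, so if it is defined in a header that several .cpp files include, it needs inline to avoid multiple-definition errors at link time.

```cpp
// Hypothetical header contents, e.g. numeric_id.h (names are illustrative).
template <typename FloatT>
class numeric_id
{
public:
    static int value();   // declared here, defined outside the class
};

// Primary template definition: a template may already be defined in several
// translation units, so `inline` here mainly acts as an inlining hint.
template <typename FloatT>
inline int numeric_id<FloatT>::value()
{
    return 0;   // generic case
}

// Explicit specialization: this is an ordinary function, so `inline` is what
// allows this header to be included from more than one .cpp file.
template <>
inline int numeric_id<double>::value()
{
    return 2;   // double-specific case
}
```

With these definitions, numeric_id<float>::value() uses the generic definition and numeric_id<double>::value() uses the specialization.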

2. And most importantly, since I would like to have the definitions of all the member functions of a template class gathered together in a single file, either it's .inl or .cpp (using explicit instantiation in case of .cpp), preferably still being able to hint the compiler (MSVC and GCC) which of the functions should be inlined and which shouldn't, sure if such thing is possible with template functions, how can I achieve this or, if there is really no way (I hope there is), what would be the most optimal compromise?

As you know, the compiler may elect to inline a function, whether or not it has the inline specifier; the inline specifier is just a hint.

There is no standard way to force inlining or prevent inlining; however, most C++ compilers support syntactic extensions for accomplishing just that. MSVC supports a __forceinline keyword to force inlining, and __declspec(noinline) or #pragma auto_inline(off) to prevent it. G++ supports always_inline and noinline attributes for forcing and preventing inlining, respectively. You should refer to your compiler's documentation for details, including how to enable diagnostics when the compiler is unable to inline a function as requested.

If you use those compiler extensions, then you should be able to hint to the compiler whether a function is inlined or not.
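
One possible way to wrap those extensions is sketched below. The macro names MYLIB_FORCE_INLINE and MYLIB_NO_INLINE and the vec2 class are hypothetical and not part of any library; the fallback branch assumes an unknown compiler and simply degrades to the plain inline hint.

```cpp
// Hypothetical wrapper macros over the compiler-specific extensions.
#if defined(_MSC_VER)
#  define MYLIB_FORCE_INLINE __forceinline
#  define MYLIB_NO_INLINE    __declspec(noinline)
#elif defined(__GNUC__)   // GCC, and compilers that emulate it
#  define MYLIB_FORCE_INLINE inline __attribute__((always_inline))
#  define MYLIB_NO_INLINE    __attribute__((noinline))
#else                     // unknown compiler: fall back to the plain hint
#  define MYLIB_FORCE_INLINE inline
#  define MYLIB_NO_INLINE
#endif

// Illustrative class template; everything lives in one header-style file.
template <typename FloatT>
class vec2
{
public:
    vec2(FloatT x, FloatT y) : x_(x), y_(y) {}

    // Lightweight accessors: request inlining.
    MYLIB_FORCE_INLINE FloatT x() const { return x_; }
    MYLIB_FORCE_INLINE FloatT y() const { return y_; }

    // Stand-in for a heavier routine: request a real call.
    MYLIB_NO_INLINE FloatT dot(const vec2& other) const;

private:
    FloatT x_, y_;
};

template <typename FloatT>
FloatT vec2<FloatT>::dot(const vec2& other) const
{
    return x_ * other.x_ + y_ * other.y_;
}
```

This keeps every definition in a single file while still marking, per function, which way the compiler should lean. Whether the request is honoured in every case is still up to the compiler, so the diagnostics mentioned above remain worth enabling.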

In general, I recommend having all "simple" member function definitions gathered together in a single file (usually the header). By "simple" I mean a member function that does not require many more #includes than the set already needed to define the classes/templates. Sometimes, for example, a member function definition will require #include <algorithm>, but it is unlikely that the class definition itself needs <algorithm> in order to be defined. Your compiler is able to skip over function definitions that it does not use, but a larger number of #includes can noticeably lengthen compile times, and it is unlikely that you will want to inline these non-"simple" functions anyway.

3. If compilers are so smart these days that they can make better choices about which function should be inlined and which should be called and are capable of link-time code generation and link-time optimization, which effectively allows them looking into a .cpp-located function definition at link time to decide its fate about being inlined or called, then maybe a good solution would be simply moving all the definitions into respective .cpp files?

If you place all of your function definitions into CPP files, then you will be relying on LTO for nearly all function inlining. This may not be what you want for the following reasons:

- At least with MSVC's LTCG, you give up the ability to force inlining (see inline, __inline, __forceinline).
- If the CPP files are linked into a shared library, then programs linking with that shared library will not benefit from LTO inlining of library functions. This is because the compiler intermediate language (IL), which is the input to LTO, has been discarded and is not available in the DLL or SO.
- If Under The Hood: Link-time Code Generation is still correct, "calls to functions in static libraries can't be optimized".
- The linker would be performing all inlining, which might be a lot slower than having the compiler perform some inlining at compile time.
- The compiler's LTO implementation might have bugs that cause it to not inline certain functions.
- Use of LTO might impose certain limitations on projects using your library. For example, according to Under The Hood: Link-time Code Generation, "precompiled headers and LTCG are incompatible". The /LTCG (Link-time Code Generation) MSDN page has other notes, such as "/LTCG is not valid for use with /INCREMENTAL".

If you keep the likely-to-be-inlined function definitions in the header files, then you can use both compiler inlining and LTO. On the other hand, moving all function definitions into CPP files restricts compile-time inlining to within each translation unit.

An elaborate answer to the question #1, thank you. I've also added question #3, in case if you are interested. — Desmond Hume, Jul 13 '12 at 17:27

Jonathan Wakely · Answer 3 · 2012-07-13T17:40:27.747

I don't know where you learnt that, but templates are not "automatically qualified as inline by the compiler regardless of whether it's marked with inline or not". Templates and inline functions both have what is sometimes called "vague linkage" meaning their definitions can be present in multiple objects without error and the linker will use one of the definitions and discard the others. But the fact templates and inline functions both have vague linkage doesn't mean templates are automatically inline. Lions and tigers are both big cats but that doesn't mean lions are tigers.
Unless you know all the instantiations you are using in advance you can't always use explicit instantiation e.g. if you're writing a template library for others to use then you can't provide all the explicit instantiations, so you must define the template in .h (or .inl) files that the user of the code can #include. If you do know all the instantiations in advance then using explicit instantiations in .cpp files has the advantage of improving compilation time, because the compiler only instantiates the templates once in the file containing the explicit instantiations, not in every file that uses them. But that has nothing to do with inlining. For a function to be inlined its definition must be visible to the code calling it, so if you only define function templates (or member functions of class templates) in a .cpp file then they can't be inlined anywhere except in that file. If you define them in a .cpp file and do qualify them as inline then you might cause problems trying to call them from other files, which can't see the inline keyword (if a function is declared inline in one translation unit it must be declared inline in all translation units in which it appears, [dcl.fct.spec]/4.)
For what it's worth, I don't generally bother using .inl files, I just define templates directly in .h files, which gives one less file to deal with. Everything's in one place, and it just works, all files that use the templates can see the definitions and choose to inline them if desired. You can still use explicit instantiations in that case too, to improve compilation time and reduce object file size, without sacrificing inlining opportunities.
Why would that be better than just defining your template code in headers, where it belongs? What exactly are you trying to achieve? If it's fewer files, put the template code in headers, that will always work, the compiler can choose to inline everything without needing LTO, and you only have one file per class template (and you can still use explicit instantiation to improve compilation times). If you're trying to move all your code into .cpp files (which I think you're focusing on too much) then go ahead and do it. I think it's a bad idea, and will probably cause problems (link-time optimisation still has issues with the only compiler I've tried using it with, and certainly won't make compilation any faster) but if that's what you want, do whatever floats your boat.

It seems like your questions revolve around a misunderstanding here:

I would have all my class implementations in .inl files only if I knew a reliable way to prevent some functions from being inlined,

If all your template definitions are in header files you don't need "a reliable way to prevent some functions from being inlined" ... as I said above, templates are not automatically inline just because they're in headers, and if they're too large to inline the compiler won't inline them. First problem solved. Secondly:

or in .cpp files if inline keyword could strongly suggest compiler to inline some of the functions, which, of course, it does not, especially if a function marked with inline is located in a .cpp file.

As I said above, a function marked inline in a .cpp file is ill-formed unless it's also marked inline in the header, and never used in any other .cpp file. So doing this is just making life difficult and possibly causing linker errors. Why bother?

Again, all signs point to just put your template definitions in headers. You can still use explicit instantiation (as GCC does for std::string, as mentioned in the post you link to) so you get the best of both worlds. The only thing it doesn't achieve is hiding the implementations from users of the templates, but it doesn't sound like that's your aim anyway, if it is then provide non-template function API, which can be implemented in terms of templates in a single .cpp file.
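
A minimal sketch of that combination follows, assuming a C++11 compiler for the extern template declarations. The class matrix2 and the file names in the comments are hypothetical; the two "files" are shown together here only so the sketch is self-contained.

```cpp
// ---- matrix2.h (hypothetical header) ----
template <typename FloatT>
class matrix2
{
public:
    matrix2(FloatT a, FloatT b, FloatT c, FloatT d)
        : a_(a), b_(b), c_(c), d_(d) {}

    FloatT a() const { return a_; }   // defined in class: implicitly inline

    FloatT determinant() const;       // defined below, still in the header

private:
    FloatT a_, b_, c_, d_;
};

// The definition stays visible to every file that includes the header,
// so the compiler remains free to inline it at the call site.
template <typename FloatT>
FloatT matrix2<FloatT>::determinant() const
{
    return a_ * d_ - b_ * c_;
}

// Explicit instantiation declarations (C++11): including translation units
// are told not to emit their own out-of-line copies for float and double.
extern template class matrix2<float>;
extern template class matrix2<double>;

// ---- matrix2.cpp (hypothetical source file) ----
// #include "matrix2.h"
// Explicit instantiation definitions: the out-of-line copies are emitted once.
template class matrix2<float>;
template class matrix2<double>;
```

The definitions remain in the header, so inlining does not depend on LTO, while the single pair of explicit instantiation definitions keeps the out-of-line code in one object file.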

Well, what I'm trying to achieve is the classical separation of the function's declaration from its implementation. And about 95% of template function definitions would really better be hidden from the client program/user of my library. But it's enough of the first reason alone for me to think in the .cpp direction.. — Desmond Hume, Jul 13 '12 at 18:03
`.inl` files are just headers by another name. It can certainly help make the code clearer to split a header into two, for declaration and definition, and it doesn't matter if you call it `.inl` or `.tcc` or something else. Personally I often just put the definitions at the bottom of the file, after all the declarations. But however you do it you can't fully get the classical separation of interface and implementation with templates, due to the requirement for template implementations to always be visible. — Jonathan Wakely, Jul 13 '12 at 18:37
But you see, there is no requirement for template implementations to be always visible provided that the template class is explicitly instantiated. However, if you say that you've had issues with link-time optimization, then probably .inl files is the way to go, I'm just curious what issues they were, with what compiler, and could it be that the present day compilers are already free of such issues. — Desmond Hume, Jul 13 '12 at 18:49
But as I said in my answer, it's not always true (in fact I'd say it's uncommon) that authors of templates know in advance all the types the template will be instantiated with, so _in general_ you cannot rely on explicit instantiations allowing you to only put template definitions in `.cpp` files. — Jonathan Wakely, Jul 13 '12 at 19:10
I meant issues with LTO such as linker errors, due to compiler and/or linker bugs, with the latest versions, e.g. like [this](http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53604) or [this](http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53895). LTO adds an extra level of complexity and stresses the toolchain significantly more than without it. — Jonathan Wakely, Jul 13 '12 at 19:13
(Just a side note, not to start any discussion: It's maybe uncommon that authors of templates know in advance all the types the template will be instantiated with, but, in the context of my library, it's a very firm knowledge since, as I mentioned (probably not clear enough), it's only `float` and `double` the classes can be templatized with.) — Desmond Hume, Jul 14 '12 at 13:01
(Oh, and I edited the question to cite the [source](http://stackoverflow.com/questions/11416747/multiple-definitions-of-a-non-template-class-vs-a-template-class) that made me think that template functions are automatically inline. Wrong assumption ofc. and Mark Ransom wasn't very sure when he expressed his belief that they are.) — Desmond Hume, Jul 14 '12 at 14:02

score 1 · Answer 4 · answered Jul 14 '12 at 07:12

This is not a complete answer.

I read that clang and LLVM are able to do very comprehensive link-time optimization. This includes link-time inlining! To enable this, compile with optimization level -O4 when using clang++ (newer clang releases use -flto for this instead and treat -O4 like -O3). The object files will then contain LLVM bitcode instead of machine code, which is what makes this possible. This feature should therefore allow you to put all of your definitions in the cpp files, knowing that they will still be inlined where necessary.

Btw, the length of a function body is not the only thing that determines whether it will be inlined. A lengthy function that is only called from one location can easily be inlined at that location.
