C++ `inline` keyword and compiler optimization

Question

I keep hearing that the inline keyword is no longer useful as an optimization hint for modern compilers, and that its remaining use is to avoid multiple-definition errors in multi-source projects.

But today I encountered an example where the compiler obeys the keyword.

Without the inline keyword, the following code

#include <iostream>

using namespace std;

void func(const int x){
    if(x > 3)    
        cout << "HAHA\n";
    else
        cout << "KKK\n";
}

int main(){
    func(5);
}

compiled with the command g++ -O3 -S a.cpp, generates assembly code in which func is not inlined.

However, if I add the inline keyword in front of the definition of func, func is inlined into main.

The relevant part of the generated assembly code (without inline) is

.LC0:
    .string "HAHA\n"
.LC1:
    .string "KKK\n"
    .text
    .p2align 4,,15
    .globl  _Z4funci
    .type   _Z4funci, @function
_Z4funci:
.LFB975:
    .cfi_startproc
    cmpl    $3, %edi
    jg  .L6
    movl    $4, %edx
    movl    $.LC1, %esi
    movl    $_ZSt4cout, %edi
    jmp _ZSt16__ostream_insertIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_PKS3_l
    .p2align 4,,10
    .p2align 3

main:
.LFB976:
    .cfi_startproc
    subq    $8, %rsp
    .cfi_def_cfa_offset 16
    movl    $5, %edi
    call    _Z4funci
    xorl    %eax, %eax
    addq    $8, %rsp
    .cfi_def_cfa_offset 8
    ret
    .cfi_endproc

My compiler is gcc 4.8.1 / x86-64.

I suspect that the function can be inlined during the linking process, but I am not sure whether that will happen and, if so, how I can tell.

My question is why this code snippet seems to contradict modern guidelines such as "When should I write the keyword 'inline' for a function/method?"

You've shown that `inline` might influence the decision whether to inline, not that it's better in any way. — , Sep 21 '13 at 11:56
It does inline with either `static` declaration or `-flto` flag. — zch, Sep 21 '13 at 11:57
The highest rated answer to the question you quote is simply false. Ignore it. — James Kanze, Sep 21 '13 at 11:58
The compiler cannot know whether you might want to use the function elsewhere, since you're only compiling one single, unlinked translation unit. Compiling a whole program (e.g. with `-fwhole-program` or `-flto`) changes that, as does giving the function internal linkage (anonymous namespace). In the original case, once the compiler has produced the external function body, it considers it more expedient to call that function rather than duplicate the code in the main function. — Kerrek SB, Sep 21 '13 at 11:59
@KerrekSB What does whether the function is used elsewhere or not have to do with whether it is inlined in this translation unit? — James Kanze, Sep 21 '13 at 12:01
@JamesKanze: Avoid duplication when there's already a function definition around? — Kerrek SB, Sep 21 '13 at 12:01
What I find more disappointing is that GCC doesn't emit a `cmov` instruction for the printing. Conditional operator to the rescue... — Kerrek SB, Sep 21 '13 at 12:02
@JamesKanze you mean, what does duplicated code and larger executable size have to do with whether or not the compiler decides to inline? Seems fairly significant to me. :) — jalf, Sep 21 '13 at 12:03
@jalf I'm afraid I don't understand your comment. Why would there be significant duplicate code depending on whether the compiler generates the function inline or not? — James Kanze, Sep 21 '13 at 12:07
@JamesKanze It has to generate a proper callable function in any case because of linkage. If it also inlines, it has to duplicate the function body into `main`. If it doesn't inline, there is no duplication, only a `call` to the function that exists anyway. — , Sep 21 '13 at 12:12
@KerrekSB Regarding cmov, why would you want it to emit a cmov? I haven't benchmarked myself, but kernel wisdom seems to be that it's rarely ever worth it: http://yarchive.net/comp/linux/cmov.html — , Sep 21 '13 at 12:14
@delnan: Thanks for the link. I don't know, folklore superstition about the cost of branching, I suppose... [Update:] That said, for the present situation the two branches are both constants, so nothing needs to be computed. I think a conditional move to pick one out of two constants is still a good deal, non? — Kerrek SB, Sep 21 '13 at 12:15
@JamesKanze In which way is said answer false? Simply telling us that it is conveys no useful information and doesn't make you seem more plausible to be correct! I guess you're referring to your answer below? — underscore_d, Feb 10 '16 at 19:54

score 3 · Accepted Answer · answered Sep 21 '13 at 12:11

The inline keyword has several effects. One of them is to hint to the compiler that you want the function to be inlined. However, that doesn't mean the compiler HAS to inline it [several compilers have an extension that says "inline this no matter what, if at all possible", such as MS's __forceinline and gcc's __attribute__((always_inline))].

The inline keyword also allows you to have multiple definitions of a function with the same name, in different translation units, if the function is declared inline, without getting errors for "multiple definitions of the same function". [But the definition must be the same source each time.]

In this case, I'm a little surprised to see the compiler NOT inline func. However, adding static to func makes it go inline too. So clearly the compiler decides this based on the reasoning that "some other function may be using func too, so we need a copy anyway, and there isn't much gain from inlining it". In fact, if you make a function static, and it's only called once, even if the function is very large, gcc/g++ will almost certainly inline it.

If you want the compiler to inline something, it never hurts to add inline. However, in many cases, the compiler will make a decent choice either way. For example, if I change the code to this:

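#include <iostream>

using namespace std;

// Returns the message instead of printing it; the stream insertion moves to main.
const char* func(const int x){
    if(x > 3)
        return "HAHA\n";
    else
        return "KKK\n";
}

int main(){
    cout << func(5);
}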

it does inline what is left of func, namely the return "HAHA\n"; part.

The compiler's logic to decide to inline or not inline is complex, and part of that is "how much do we gain, vs how much more code-space does it take up" - it's likely that the overhead of calling operator<<(ostream& ,const char *) was too much for the inliner in this case. Unfortunately, it's not always easy to understand why the compiler takes a certain decision...

score 2 · Answer 2 · answered Sep 21 '13 at 12:01

First, it is not so black and white. The only absolute effect of the inline keyword is to relax the one-definition rule (ODR) and so avoid multiple definition errors. Beyond that, the compiler is certainly free to take the keyword as a hint about inlining, but it may or may not do so. (And from what I have seen, in practice the compiler generally does ignore this optimization hint, because most people have no clue how often to inline, or what to inline, and the compiler can just do a better job of it.) But it doesn't have to ignore the hint.

Second, there could well be another reason why the call is inlined with the inline keyword but not without.

Without the inline keyword, the function definition has to be exported, as another TU might need to link to it. And since we have to export the function definition, the code is there already, and inlining the call would just mean you effectively had the function body duplicated. More total code, larger executable size, a hit to instruction cache locality.

But with the inline keyword, the compiler doesn't have to export the function definition, so it can inline the call and entirely remove the original definition. Then the total code size doesn't increase (instead of generating the function definition and a call to it, we just move the function body to the call site).

As an experiment, try marking the function as static instead of inline. That also means the compiler doesn't have to export the definition, and very likely, that will also result in it deciding that inlining is worthwhile.

I'm not sure what you mean by "the compiler generally does ignore this optimization hint". From a QoI point of view, it should only ignore it if it can do a better job than the programmer, and this isn't the case with the most common compilers. And as the question points out, g++ does _not_ ignore it; from what I've seen, nor does VC++. — James Kanze, Sep 21 '13 at 12:09
Re the sentence "most people have no clue how often to inline": I would hope that no one inlines anything until they have a performance problem. The clue comes from the actual performance of the program. (Otherwise, it's premature optimization.) — James Kanze, Sep 21 '13 at 12:29
Re your third paragraph: in my experience, most of the functions I've declared inline for performance reasons are in the unnamed namespace. You want to avoid inline in a header, precisely because it breaks encapsulation, and introduces compiler dependencies. — James Kanze, Sep 21 '13 at 12:31

score 1 · Answer 3 · answered Dec 20 '18 at 03:48

Today (in 2018), the inline keyword is still used for optimization, even in modern compilers.

The claim that they would ignore it and instead purely rely on their own cost models is not true, at least in the open source compilers GCC and Clang. Simon Brand wrote a nice blog post about it (Do compilers take inline as a hint?), where he debunked it by looking at the compiler's source code.

But it is not that these compilers will blindly follow the programmer's hint. If they have enough evidence that it will hurt performance, they will overrule you.

There are vendor-specific extensions that will force inlining, even if the compiler thinks it is a bad idea. For example, in Visual Studio it is called __forceinline:

The __forceinline keyword overrides the cost/benefit analysis and relies on the judgment of the programmer instead. Exercise caution when using __forceinline. Indiscriminate use of __forceinline can result in larger code with only marginal performance gains or, in some cases, even performance losses (due to increased paging of a larger executable, for example).

GCC and Clang call it inline __attribute__ ((__always_inline__)).

In general, trusting the compiler with the decision is recommended, especially if you can use profile-guided optimization. One notable example of a high-quality code base that nevertheless uses forced inlining is Boost (look for BOOST_FORCEINLINE).

score 0 · Answer 4 · answered Sep 21 '13 at 11:56

What you keep hearing is false, or should be. The standard clearly specifies the intent of inline: tell the compiler that it would be preferable if the compiler could generate this code inline. Until compilers can do a better job than the programmer of judging when inlining is necessary, they should take the "hint" into account. Maybe some day, inline will become irrelevant for this (as register has), but we're far from there yet.

Having said that, I'm very surprised that g++ didn't inline in your case. g++ is usually fairly aggressive about inlining, even when the function isn't marked inline. Maybe it just figured that since the function wasn't in a loop, it wasn't worth the bother.

James Kanze

Compilers typically *are* better at deciding when to inline. That does not mean there are no cases where human experts are better at it, but the same is true of register allocation - just [ask Mike Pall](http://article.gmane.org/gmane.comp.lang.lua.general/75426). And there are always compiler-specific "never inline" and "always inline if you can in any way" attributes. – Sep 21 '13 at 11:59
Can you show us a case where the `inline` optimization hint *does* cause the compiler to inline something it wouldn't otherwise have inlined? I'm not aware of such a situation (as mentioned in my answer and in comments, `static` typically causes the compiler to inline as well, because it is not the hint that is significant, but the fact that it can avoid emitting code for the function definition). The *hint*, by itself, does not make a difference in any case I have seen. I'd love to see if you know of such a case. – jalf Sep 21 '13 at 12:05
@delnan That is simply false. Compilers can come close _if_ they do whole program optimization based on profiler output; this is not the usual case, however. And neither g++ nor VC++ ignore the `inline` declaration. Inline is _not_ like `register`. – James Kanze Sep 21 '13 at 12:05
I agree that `inline` is not like `register`, but for other reasons (semantically, one just restricts what you can do while the other allows you to write code that's more amenable to inlining). And yes, it's not completely ignored. But as I already wrote in my very first comment on the question, there is a difference between "you can force inlining with it" and "using it to force inlining is useful". – Sep 21 '13 at 12:10
@jalf The question presented exactly such case. – James Kanze Sep 21 '13 at 12:10
@delnan You can't use it to force inlining (although most compilers will respect it most of the time), but the hint is certainly useful. It regularly improves performance for us. (The usual rule for optimization applies here: nothing is declared inline until we have a performance problem.) – James Kanze Sep 21 '13 at 12:26
@JamesKanze the question presented is *not* such a case, as I indicated in previous comments. There are many cases where the compiler will inline *if it determines that there is no downside*. Such cases include "a function marked `inline` which is only ever called once". By inlining it you do not grow the overall executable size (you can eliminate the original definition, and only retain the inlined version), *and* you get better instruction cache locality and save a few jumps. It's a net win, regardless of any hint given. – jalf Oct 02 '13 at 07:11
If you simply remove `inline`, then sure, you remove the hint, but you also remove the guarantee the compiler previously had, that the original definition could be eliminated - now, someone else could declare it `extern` and call it, *so the definition must be there*. And so, inlining would require duplicating code - it becomes less beneficial, and the compiler won't do it. Again, without needing to look at any hints you did or did not give it – jalf Oct 02 '13 at 07:13
Now, the interesting part: replace `inline` with `static` -- I hope we agree that `static` was never intended as a hint that the function should be inlined -- then the compiler will **also** inline the function (at least with the compilers I've tested), because as in the original case, it is guaranteed that the function will never be called from another TU, and so, the original definition can be eliminated, avoiding the downside to inlining. In short, this behavior is consistent with what any optimizing compiler should do, even if it disregarded the "hint" aspect of `inline`. – jalf Oct 02 '13 at 07:15
It is certainly possible that the compiler *also* looked at the hint and would have decided to inline even if it had had downsides. But this example doesn't show it, because it selects a case where inlining is objectively the *better* choice, by every single metric. That's why I asked if you know of a case where we can see for sure that the *hint* aspect of `inline` makes a difference. Perhaps one where `inline` causes inlining, and `static` does not? – jalf Oct 02 '13 at 07:18
@jalf With or without the `inline`, the compiler may or may not have to generate an out of line copy. `inline` has no impact here. – James Kanze Oct 02 '13 at 08:54
@JamesKanze: no. First, you can check the disassembly if you don't believe me, and second, consider what `inline` *means*: every translation unit which *calls* the function is required to also *contain* the function definition. Therefore, if this is the only TU to call the function, then it can just remove the definition. If others call it as well, then it follows that they must contain an identical definition - and if they do, then they didn't need ours, so we can remove it freely – jalf Oct 02 '13 at 08:57
@jalf If there aren't such cases with g++ or VC++, then the compiler is seriously defective. Neither has particularly good analysis of when inlining is effective or not. – James Kanze Oct 02 '13 at 09:00
