77

Can/does the compiler inline lambda functions to increase efficiency, as it might with simple standard functions?

e.g.

std::vector<double> vd;
std::for_each(vd.begin(), vd.end(), [](const double d) {return d*d;});

Or is there loss of efficiency caused by lack of optimisation?

A second question: where can I check whether the compiler I use has optimised calls to inline functions that are passed to an algorithm? What I mean is, if a function (not a function object) is passed to an algorithm, the algorithm receives a pointer to the function, and some compilers optimize calls through pointers to inline functions while others don't.

Cody Gray - on strike
Illia Levandovskyi
  • 6
    Some are optimized, some are not, like any function call. If you are interested in a specific call, you need to check what your specific compiler does with that specific call. – n. m. could be an AI Apr 10 '13 at 16:11
  • 5
    You’re confusing concepts here. All lambdas are inline. Not all calls to them are necessarily inlined. – Konrad Rudolph Apr 10 '13 at 17:14
  • I don't think a lambda can be inlined if it is passed to an external function. – Brent Bradburn May 06 '14 at 16:50
  • 2
    @Konrad Rudolph Don't confuse people: inline code is not the same as in-place code. All lambdas are definitely NOT inline, but all lambdas are definitely in place. `inline` is reserved for a compiler-specific optimization, which has little to do with in-place code such as lambdas. – Andry Oct 22 '17 at 13:08
  • @Andry You got that completely backwards. Read my answer below for a correct explanation. – Konrad Rudolph Jun 28 '19 at 14:21

3 Answers

61

First off: the whole point of the design of lambdas in C++ is that they don’t have an overhead compared to function calls. That notably includes the fact that calls to them can be inlined.

But there’s a confusion of concepts here: in the C++ standard, “inline” is a property of a function, i.e. it is a statement about how a function is defined, not how it gets called (in particular, it permits multiple identical definitions of the same name in multiple translation units). Functions that are defined inline can benefit from a compiler optimisation by which calls to such functions are inlined. That is a different (though closely related) concept.

In the case of lambdas, the actual function being called is a member operator() that is implicitly defined as inline in an anonymous class created by the compiler for the lambda. Calls of the lambda are translated to direct calls to its operator() and can therefore be inlined. I’ve explained how the compiler creates lambda types in more detail in another answer.
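To make the translation concrete, here is a rough sketch of what the compiler conceptually generates for the lambda in the question. The names `SquareClosure`, `square_with_functor` and `square_with_lambda` are made up for illustration (the real closure type has no name you can spell), and `std::transform` is used instead of `std::for_each` only so that the result is observable.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-in for the unnamed closure type of
// [](const double d) { return d * d; }.
struct SquareClosure {
    // A member function defined inside the class body is implicitly inline.
    double operator()(const double d) const { return d * d; }
};

// Hypothetical helper: squares every element using the hand-written functor.
inline std::vector<double> square_with_functor(std::vector<double> v) {
    std::transform(v.begin(), v.end(), v.begin(), SquareClosure{});
    return v;
}

// Hypothetical helper: the same operation written with a lambda.
inline std::vector<double> square_with_lambda(std::vector<double> v) {
    std::transform(v.begin(), v.end(), v.begin(),
                   [](const double d) { return d * d; });
    return v;
}
```

In both versions the algorithm is instantiated with a distinct class type, so the call inside the loop is a direct call to a known `operator()` rather than a call through a pointer. Whether it actually gets inlined is still up to the optimiser.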

Konrad Rudolph
  • 4
    Funny enough this is one of my most downvoted answers, yet nobody has explained what’s wrong with it. – Konrad Rudolph Jul 26 '18 at 09:25
  • 5
    I think this should be at the top, the fact that it is called as a member operator() makes it very easy to set the expectation regarding inlining. – rahenri Nov 01 '18 at 17:13
  • 1
    Indeed, it's funny to see that the accepted and most upvoted answer actually refers to an answer that you wrote, while your answer here, which is similar to the linked one, is being downvoted. Actually I think yours (choose which :-P) is more to the point... – andreee Jun 28 '19 at 14:19
  • Simply put, lambdas in C++ should have overhead; otherwise, how is everything in them hidden behind a simple `operator()`? You are misleading people with a false alternative. The idea of lambdas is not about overhead; the idea was to pass code as data to places where it could not be sent before, and to call it later on demand. That's all. – Andry Jun 28 '19 at 17:03
  • Secondly, `inline` is neither needed nor required to get inlining. The main requirement is that the code is visible through all casts and conversions from the definition to the point of use, and perhaps a bit further. `inline` has nothing to do with that. – Andry Jun 28 '19 at 17:08
  • 4
    @Andry Again, you have no idea what you’re talking about, and your first comment in particular makes no sense whatsoever. You are getting *fundamental* facts about C++ (in particular about linkage) wrong, and until you learn more about the basics of the language and the compiler/linker architecture you won’t understand how lambdas work in C++. – Konrad Rudolph Jun 30 '19 at 11:14
  • 1
    @cz You’re right, I keep incorrectly calling the inline“-ness” of a function “linkage”. I think my issue is that there’s no proper term (equivalent to “linkage”) in the standard that describes a function’s inline-ness so I misappropriated the term. I’ve fixed the phrasing in my answer. – Konrad Rudolph Aug 09 '23 at 12:21
49

In simple cases, like your example, you should expect better performance with lambdas than with function pointers, see

Why can lambdas be better optimized by the compiler than plain functions?

As others have already pointed out, there is no guarantee that your call will be inlined, but you have better chances with lambdas. One way of checking whether the call has been inlined is to inspect the generated code. If you are using GCC, pass the -S flag to the compiler to get the assembly output. Of course, this assumes that you can read the assembly code.
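As a small sketch of what you might feed to the compiler for such a check: the function names below (`square`, `sum_mapped`, `sum_squares_via_pointer`, `sum_squares_via_lambda`) and the file name are invented for this example. The same template is called once with a plain function, which decays to a function pointer, and once with a lambda.

```cpp
// Assumed file name: example.cpp
// To look at the assembly with GCC:
//   g++ -O2 -S example.cpp -o example.s
// and then search example.s for call instructions inside the two
// sum_squares_via_* functions.
#include <vector>

// Hypothetical free function; passing its name to a template yields a
// function pointer of type double (*)(double).
inline double square(double d) { return d * d; }

// Hypothetical helper: F is a function pointer type in one instantiation
// and a unique closure type in the other.
template <typename F>
double sum_mapped(const std::vector<double>& v, F f) {
    double total = 0.0;
    for (double d : v) total += f(d);
    return total;
}

double sum_squares_via_pointer(const std::vector<double>& v) {
    return sum_mapped(v, square);
}

double sum_squares_via_lambda(const std::vector<double>& v) {
    return sum_mapped(v, [](double d) { return d * d; });
}
```

Both functions compute the same value; the only thing that may differ is the generated code. With optimisation enabled, many compilers will inline both, so treat the output of your own compiler and flags as the only reliable answer.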



Update on Sep 11, 2018: Vipul Kumar pointed out two compiler flags in his edit.

GCC -Winline

Warn if a function that is declared as inline cannot be inlined. Even with this option, the compiler does not warn about failures to inline functions declared in system headers.

The compiler uses a variety of heuristics to determine whether or not to inline a function. For example, the compiler takes into account the size of the function being inlined and the amount of inlining that has already been done in the current function. Therefore, seemingly insignificant changes in the source program can cause the warnings produced by -Winline to appear or disappear.

As I understand this, if your function is not declared inline, this compiler flag is most likely not helpful. Nevertheless it is good to know it exists and it partly answers your second question.

The other flag that he pointed out is:

Clang -Rpass=inline

Options to Emit Optimization Reports

Optimization reports trace, at a high-level, all the major decisions done by compiler transformations. For instance, when the inliner decides to inline function foo() into bar() [...]

I haven't used this one myself but based on the documentation it might be useful for your use case.
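For what it's worth, here is a minimal sketch of how one might try these flags on a lambda passed to a standard algorithm. The file name and the function `norm_squared` are made up for the example, and the exact remarks or warnings you get depend on the compiler version, so no expected output is shown.

```cpp
// Assumed file name: inline_report.cpp
// Possible invocations (-c compiles without linking, so no main is needed):
//   clang++ -O2 -Rpass=inline -Rpass-missed=inline -c inline_report.cpp
//   g++ -O2 -Winline -c inline_report.cpp
#include <numeric>
#include <vector>

// Hypothetical function: sums the squares of the elements. The lambda is
// passed to std::accumulate, which calls its operator() once per element.
double norm_squared(const std::vector<double>& v) {
    auto add_square = [](double acc, double d) { return acc + d * d; };
    return std::accumulate(v.begin(), v.end(), 0.0, add_square);
}
```

With Clang, `-Rpass=inline` reports inlining decisions that succeeded and `-Rpass-missed=inline` reports the ones that did not, which makes it easier to spot whether the lambda's `operator()` ended up inside `norm_squared`.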

I personally check the generated assembly whenever it is that important.

Community
Ali
15

It depends on the optimisation level given to the compiler. Take, for example, these two functions, which are semantically identical. One is written in C++11 style, the other in C style.

void foo1 (void)
{
    int arr[100];
    std::generate(std::begin(arr), std::end(arr), [](){return std::rand()%100;});
}

void foo2 (void)
{
    int arr[100];
    for (int *i = arr; i < arr+100; i++) *i = std::rand()%100;
}

Compiling this with gcc -O4 (which GCC treats the same as -O3) emits code that is extremely similar (not identical, but of equivalent complexity) for the two functions.

However, the lambda is not inlined when compiling without optimisation (and neither are the std::begin and std::end calls).

So although the compiler can (and does) do an excellent job of optimizing the modern-style code when asked to do so, there is likely to be a performance penalty for this kind of code in an unoptimized debug build.

pconnell
  • 19
    There is no actual "-O4" in gcc by the way. – DrYak Aug 29 '14 at 12:34
  • 23
    @DrYak I always compile with "-O9999" because I haven't found the [limit-breaker](http://finalfantasy.wikia.com/wiki/Break_Damage_Limit) yet... Of course I jest. Technically everything above "-O3" translates to "-O3". – Emily L. Oct 13 '14 at 13:33
  • +1 for pointing out the obvious, albeit often ignored fact that people should not ship debug builds :) I have seen it way too often during my professional life and even got told by those guys that my code was "slow", and of course they did ship DEBUG builds lol. – BitTickler Nov 30 '19 at 08:53
  • 3
    @EmilyL. What about `-O-1`? Does it underflow wrap to the max value? :) – Andrew May 31 '21 at 10:11
  • @EmilyL. I think this is the key to getting -O-1, where GCC deliberately tries to slow down the code. Have you ever tried -O[limit-breaker + 1] yet? Does it loop around to negative values? – Shambhav Jan 07 '22 at 13:13