Overhead of recursive lambdas

Question

Do recursive lambda functions induce any overhead comparing to regular recursive functions (since we have to capture them into a std::function) ?
What is the difference between this function and a similar one using only regular functions ?

int main(int argc, const char *argv[])
{
    std::function<void (int)> helloworld = [&helloworld](int count) {
        std::cout << "Hello world" << std::endl;
        if (count > 1) helloworld(--count);
    }; 
    helloworld(2);
    return 0; 
}

Your main problem will be less compiler optimization http://ideone.com/QsZVfH — stefan, Jun 12 '13 at 14:16
Nice ! Do you have an idea why ? The compiler couldn't just guess that this code is similar to a regular function call (no-capture but itself) ? — 3XX0, Jun 12 '13 at 14:28
@fscan I have to since it's a recursive lambda function. Type inference doesn't work here — 3XX0, Jun 12 '13 at 14:29
sorry, didn't know about this special case, will delete the comment. — fscan, Jun 12 '13 at 14:36
The language does not give lambda expressions the ability of calling themselves. This is important in order to understand what your snippet entails. — Luc Danton, Jun 12 '13 at 16:33
I see a lambda closure that calls a `std::function` by reference, not a recursive lambda. That `std::function` is then assigned to by the closure. `std::function` isn't a lambda closure, and a lambda closure isn't a `std::function`. Lambda closures can be stored in `std::function`s, and closures can capture `std::function`s by reference or value. The ability for a lambda closure to actually invoke itself recursively (and not through some intermediary) is not currently available in C++11... In a sense, this is better described as a "recursive `std::function`" than a recursive lambda. — Yakk - Adam Nevraumont, Jun 12 '13 at 19:32

stefan · Accepted Answer · 2013-06-13T08:02:44.203

6

There is overhead using lambdas recursively by storing it as a std::function, although they are itself basically functors. It seems that gcc is not able to optimize well which can be seen in a direct comparison.

Implementing the behaviour of a lambda, i.e. creating a functor, enables gcc of optimizing again. Your specific example of a lambda could be implemented as

struct HelloWorldFunctor
{
   void operator()(int count) const
   {
      std::cout << "Hello world" << std::endl;
      if ( count > 1 )
      {
         this->operator()(count - 1);
      }
   }
};
int main()
{
   HelloWorldFunctor functor;
   functor(2);
}

For the example I've created the functor would look like in this second demo.

Even if one introduces calls to impure functions such as std::rand, the performance without a recursive lambda or with a custom functor is still better. Here's a third demo.

Conclusion: With the usage of a std::function, there's overhead, although it might be negligible depending on the use case. Since this usage prevents some compiler optimizations, this shouldn't be used extensively.

edited Jun 13 '13 at 08:02

answered Jun 12 '13 at 14:46

stefan

10,215
4
49
90

This is true, thanks for the code, although I suspect it is very compiler-specific. Is there a way to pass compiler flags on ideone? – Pedrom Jun 12 '13 at 14:51
@Pedrom I don't think so. Perhaps if you register? Disclaimer: I did not yet intensively used lambdas since the projects I work on need backwards compatibility with compilers that don't support lambdas. – stefan Jun 12 '13 at 14:52
I am trying to validate this on my machine, but for some reason it doesn't show me any output using cygwin :/ – Pedrom Jun 12 '13 at 14:57
@Pedrom no output at all? that's weird. Try replacing the `"\n"` with `std::endl`. – stefan Jun 12 '13 at 14:59
I am actually trying on a linux server and I am getting Segmentation Fault which might explain the huge difference (because the exception and the assert) – Pedrom Jun 12 '13 at 15:04
@Pedrom Try with a much smaller `n`. You're experiencing a stack overflow. – stefan Jun 12 '13 at 15:05
I could check... the difference is less dramatic but still significant. I wanted to compare it with visual studio but chrono is not available on VS10 so I would have to make some changes. Interesting anyway... – Pedrom Jun 12 '13 at 15:23
@Pedrom _any_ time measurement will suffice, the choice of implementation doesn't really affect the results. I'd be interested to know how VS10 copes as I don't have any compiler on my Windows machine – stefan Jun 12 '13 at 15:25
Yeah, I am actually doing that right now, but I haven't ever used QueryPerformanceCounter before... and I am at work so probably I will do this at lunch – Pedrom Jun 12 '13 at 15:32
Any benchmark that starts with undefining `NDEBUG` is highly suspect to me. What is the purpose of doing that? – Sebastian Redl Jun 12 '13 at 15:54
@SebastianRedl in the second demo, to enable the assertions. in the third demonstration, a useless relict, which is now removed. – stefan Jun 12 '13 at 16:00
There is a comment on this answer which pretty much explains the difference: http://stackoverflow.com/a/14591727/1112142 Since the lambda is wrapped on std:function the call is to a virtual function which *does* have an overhead. – Pedrom Jun 12 '13 at 16:01
7

It is ***not*** the lambda that creates the overhead. I repeat, it is ***not*** the lambda. It's `std::function`, which uses type-erasure to store the lambda. – Xeo Jun 12 '13 at 17:22
@Xeo Thanks for pointing that out. I adressed that in an update of my answer. – stefan Jun 12 '13 at 17:52

score 3 · Answer 2 · answered Jun 12 '13 at 16:57

So std::function is implemented polymorphically. Meaning your code is roughly equivalent to:

struct B
{
    virtual void do(int count) = 0;
};

struct D
{
    B* b;
    virtual void do(int count)
    {
        if (count > 1)
            b->do(count--);
    }
};

int main()
{
    D d;
    d.b = &d;
    d.do(10);
}

It is rare to have a tight enough recursion such that a virtual method lookup is a significant overhead, but depending on your application area it is certainly possible.

score 2 · Answer 3 · edited May 23 '17 at 10:26

2

Lambdas in C++ are equivalent to functors so it would be the same that calling the operator() of some class created automatically by the compiler. When you capture the environment what's happening behind the scenes is that those captured variables are passed to the constructor of the class and stored as member variables.

So in short, the performance difference should be very close to zero.

Here there is some further explanation:

Jump to the section "How are Lambda Closures Implemented?" http://www.cprogramming.com/c++11/c++11-lambda-closures.html

EDIT:

After some research and thanks to Stefan answer and code, it turned out that there is an overhead on recursive lambdas because of the std::function idiom. Since the lambda has to be wrapped on a std::function in order to call itself, it involves to call a virtual function which does add an overhead.

The subject is treated on the comments of this answer: https://stackoverflow.com/a/14591727/1112142

edited May 23 '17 at 10:26

Community

1
1

answered Jun 12 '13 at 13:38

Pedrom

3,823
23
26

Stephan Lavavej presents the similar idea [here](http://channel9.msdn.com/Series/C9-Lectures-Stephan-T-Lavavej-Core-C-/Stephan-T-Lavavej-Core-C-9-of-n) (at 0:38:20). Thus it seems that calling a regular function would induce less overhead. However I remember reading somewhere that this functor approach could be optimized with function pointers in some cases. Is that true ? what about the case I provided ? – 3XX0 Jun 12 '13 at 14:21
@3XX0 That's a nice video, thanks for the link :) However he said that if you plan to using it over and over it would make sense to write a regular function, but that's just because it is easier to call in different parts of the code, he is not talking about performance. Regarding the function pointers thing, you could use them *instead of* lambdas, but I don't think they are going to be as convenient and in practice I hadn't seen any performance difference. – Pedrom Jun 12 '13 at 14:36
@3XX0 As for your example, it is exactly like calling a function void hello_world(myfunc foo, int count) if there is *any* overhead it would be at the first call (because you need to instantiate an object foo to pass it to the method, but after that all the following calls are exactly the same in terms of performance. Moreover, if your recursive function is a method then there is not difference at all. – Pedrom Jun 12 '13 at 14:40
@3XX0 I was thinking about the optimization that you asked and yeah it is possible to optimize the call a bit to avoid the bind to the std::function variable, but I think that in that case would make more sense to call a plain function as the lambda version would lose its readability advantage. – Pedrom Jun 12 '13 at 16:29

Overhead of recursive lambdas

3 Answers3