C++ lambda not capturing variable on 2nd expansion in template?

Question

I have some tortuous code in a template that uses @R. Martinho Fernandes's trick to loop unroll some packed parameters in a variadic template and invoke the same code on each argument in the argument list.

However, it seems as though the lambdas are not being initialized properly and they are instead sharing variables across functor(?) instances, which seems wrong.

Given this code:

#include <iostream>
#include <functional>

template<typename... Args>
void foo(Args ... args) {
  int * bar = new int();
  *bar = 42;

  using expand_type = int[];
  expand_type{(
    args([bar]() {
        std::cerr<<std::hex;
        std::cerr<<"&bar="<<(void*)&bar<<std::endl;
        std::cerr<<"  bar="<<(void*)bar<<std::endl;
        std::cerr<<"  bar="<<*bar<<std::endl<<std::endl;
    }),
    0) ... 
  };
};

int main() {
  std::function<void(std::function<void()>)> clbk_func_invoker = [](std::function<void()> f) { f(); };
  foo(clbk_func_invoker, clbk_func_invoker);

  return 0;
}

I get the following output:

&bar=0x7ffd22a2b5b0
  bar=0x971c20
  bar=2a

&bar=0x7ffd22a2b5b0
  bar=0
Segmentation fault (core dumped)

So, what I believe I'm seeing is that the two functor instances share the same address for captured variable bar, and after the invocation of the first functor, bar is being set to nullptr, and then the second functor seg'-faults when it tries to dereference the same bar variable ( in the exact same address ).

FYI, I realize that I can work around this issue by moving the [bar](){... functor into a variable std::function variable and then capturing that variable. However, I would like to understand why the second functor instance is using the exact same bar address and why it is getting a nullptr value.

I ran this with GNU's g++ against their trunk version retrieved and compiled yesterday.

Yakk - Adam Nevraumont · Accepted Answer · 2016-08-24T19:04:17.877

Parameter packs with lambdas in them tend to give compilers fits. One way to avoid that is to move the expansion part and the lambda part separate.

template<class F, class...Args>
auto for_each_arg( F&& f ) {
  return [f=std::forward<F>(f)](auto&&...args){
    using expand_type = int[];
    (void)expand_type{0,(void(
      f(decltype(args)(args))
    ),0)...};
  };
}

This takes a lambda f and returns an object that will invoke f on each of its arguments.

We can then rewrite foo to use it:

template<typename... Args>
void foo(Args ... args) {
  int * bar = new int();
  *bar = 42;

  for_each_arg( [bar](auto&& f){
    f( [bar]() {
      std::cerr<<std::hex;
      std::cerr<<"&bar="<<(void*)&bar<<std::endl;
      std::cerr<<"  bar="<<(void*)bar<<std::endl;
      std::cerr<<"  bar="<<*bar<<std::endl<<std::endl;
    } );
  } )
  ( std::forward<Args>(args)... );
}

live example.

I initially thought it had to do with the std::function constructor. It does not. A simpler example without a std::function that crashes the same way:

template<std::size_t...Is>
void foo(std::index_sequence<Is...>) {
  int * bar = new int();
  *bar = 42;

  using expand_type = int[];
  expand_type{(
    ([bar]() {
      std::cerr<<"bar="<<*bar<<'\n';
    })(),
    (int)Is) ... 
  };
}

int main() {
  foo(std::make_index_sequence<2>{});

  return 0;
}

we can invoke the segfault without the cerr, giving us disassembly that is easier to read:

void foo<3, 0ul, 1ul>(std::integer_sequence<unsigned long, 0ul, 1ul>)::{lambda()#1}::operator()() const:
    pushq   %rbp
    movq    %rsp, %rbp
    movq    %rdi, -8(%rbp)
    movq    -8(%rbp), %rax
    movq    (%rax), %rax
    movl    $3, (%rax)
    nop
    popq    %rbp
    ret
void foo<3, 0ul, 1ul>(std::integer_sequence<unsigned long, 0ul, 1ul>):
    pushq   %rbp
    movq    %rsp, %rbp
    pushq   %rbx
    subq    $40, %rsp
    movl    $4, %edi
    call    operator new(unsigned long)
    movl    $0, (%rax)
    movq    %rax, -24(%rbp)
    movq    -24(%rbp), %rax
    movl    $42, (%rax)
    movq    -24(%rbp), %rax
    movq    %rax, -48(%rbp)
    leaq    -48(%rbp), %rax
    movq    %rax, %rdi
    call    void foo<3, 0ul, 1ul>(std::integer_sequence<unsigned long, 0ul, 1ul>)::{lambda()#1}::operator()() const
    movabsq $-4294967296, %rax
    andq    %rbx, %rax
    movq    %rax, %rbx
    movq    $0, -32(%rbp)
    leaq    -32(%rbp), %rax
    movq    %rax, %rdi
    call    void foo<3, 0ul, 1ul>(std::integer_sequence<unsigned long, 0ul, 1ul>)::{lambda()#1}::operator()() const
    movl    %ebx, %edx
    movabsq $4294967296, %rax
    orq     %rdx, %rax
    movq    %rax, %rbx
    nop
    addq    $40, %rsp
    popq    %rbx
    popq    %rbp
    ret

I have yet to parse the disassembly, but it obviously trashes the state of the second lambda when playing with the first.

Personally, I'm inclined to think it is actually a compiler bug in GCC, but I would like to be absolutely certain before filing it. I know you should almost never blame the compiler, but maybe this time... Also, I have a work around for my real case. I wouldn't be surprised if this was a relatively unique test case for the compiler. The code is rather preposterous. — Ross Rogers, Aug 24 '16 at 20:38
@Yakk you didn't answered the core issue, *..they are instead sharing variables across functor(?) instances..* why expanding the pack don't create a copy of bar for each expansion instead of sharing just one copy of bar for all expanded closures, what make this kind of expansion so special ? — Jans, Aug 24 '16 at 21:31
@xhamr because support for lambdas inside parameter pack expansion is flakey? I gave up doing it because most compilers failed to compile legal code. — Yakk - Adam Nevraumont, Aug 25 '16 at 01:53
@ross I am very certain your compiler screwed up. Figuring out how rewuires reading the disassembly or the gcc source code or both. — Yakk - Adam Nevraumont, Aug 25 '16 at 01:54
@Yakk, all was about the evaluation of the lambdas not the expansion, I posted what I found. — Jans, Aug 25 '16 at 17:52

score 2 · Answer 2 · answered Aug 24 '16 at 18:53

First of all I don't have a solution, I'd want to add this extra information as a comment, but unfortunately I can't comment yet.

I tried your previous code with Intel 17 c++ compiler and worked fine:

&bar=0x7fff29e40c50
  bar=0x616c20
  bar=2a

&bar=0x7fff29e40c50
  bar=0x616c20
  bar=2a

In some cases the &bar (the address of the new variable used to store the captured value) was different between the first call and the second, but it also worked.

I also tried your code with GNU's g++ changing the type of bar from int* to int. The captured value was wrong in the second and successive calls even in this case:

&bar=0x7fffeae12480
  bar=2a
&bar=0x7fffeae12480
  bar=0
&bar=0x7fffeae12480
  bar=0

Finally I tried modifying a bit the code and passing by value and object, so the copy constructor must be called:

#include <iostream>
#include <functional>

struct  A {
    A(int x) : _x(x) { 
      std::cerr << "Constructor!" << n++ << std::endl;
    }
    A(const A& a) : _x(a._x) {
      std::cerr << "Copy Constructor!"  << n++ << std::endl;
    }
    static int n;
    int _x;  
};

int A::n = 0;

template<typename... Args>
void foo(Args ... args) {
  A a(42);

  std::cerr << "-------------------------------------------------" << std::endl;
  using expand_type = int[];
  expand_type  {
   (args( [a]() {
          std::cerr << "&a, "<< &a << ", a._x," << a._x << std::endl;
         }
       ),
    0) ... 
  };
std::cerr << "-------------------------------------------------" << std::endl;
}

int main() {
  std::function<void(std::function<void()>)> clbk_func_invoker = [](std::function<void()> f) { f(); };
  foo(clbk_func_invoker, clbk_func_invoker, clbk_func_invoker);
  return 0;
}

My current version of g++ (g++ (GCC) 6.1.0) is not able to compile this code. I also tried with Intel and it worked, although I don't fully understand why the copy constructor is called so many times:

Constructor!0
-------------------------------------------------
Copy Constructor!1
Copy Constructor!2
Copy Constructor!3
&a, 0x617c20, a._x,42
Copy Constructor!4
Copy Constructor!5
Copy Constructor!6
&a, 0x617c20, a._x,42
Copy Constructor!7
Copy Constructor!8
Copy Constructor!9
&a, 0x617c20, a._x,42
-------------------------------------------------

That's all I tested so far.

Jans · Answer 3 · 2016-08-25T17:57:42.910

After a few test I found that all was about the evaluation of the lambdas not the pack expansion.

What you have is a set of lambdas that do not execute until the pack expansion complete so that at the moment of execution all of them observe the same instance of the variables, which would be different if the execution of each lambda correspond with the order of the expansion, then each expansion will get it's own copy of the variable and the lambda would be considered a materialized prvalue which lifetime has ended:

template<typename... Args>
void foo(Args ... args) {
    int * bar = new int();
    *bar = 42;

    using expand_type = int[];
    expand_type{( args([bar]{
       std::cerr<<std::hex;
       std::cerr<<"&bar="<<(void*)&bar<<std::endl;
       std::cerr<<"  bar="<<(void*)bar<<std::endl;
       std::cerr<<"  bar="<<*bar<<std::endl<<std::endl;
       return 0;
    }()),0) ...
  };
};

int main() {
    std::function<void(int)> clbk_func_invoker = [](int) {  };    
    foo(clbk_func_invoker, clbk_func_invoker);    
    return 0;
}

live example.

However, the compiler is able to do a little optimization even when evaluated lambdas are expanded and not executed while expanding for trivial classes under capture by copy.

Let put a more simple example:

struct A{ };

template<class... T>
auto foo(T... args){
  A a;
  std::cout<< &a << std::endl;
  using expand = int[];

 expand{ 0,(args([a] { 
      std::cout << &a << " " << std::endl; return 0; }),void(),0)... 
 };
}

foo([](auto i){ i(); }, [](auto i){  i(); });

Will output the same address of a for every expanded lambda, even when individuals copy of a are expected. since capture by copy produce a constant version of the copied variable and no mutation can be made no these copies, for trivial classes is a kind of performance to share the same instance through all expanded lambdas(because no change is guaranteed).

But if the type now is not a trivial one, that optimization has been broken and different copies are required for each expanded lambda:

 struct A{ 
    A() = default;
    A(const A&){}
 };

This change on A cause different address for a appear in the output.

So you're saying that the lambdas' capture arguments are misclassified as "trivial" when they aren't actually "trivial"? The variable that I am passing in is a pointer. I would expect the pointer to be copied by value and for that pointer to be as close to a "trivial" item as possible. The weird thing is that the pointer is set to 0 after the first lambda is invoked. That looks like some sort of destructor behavior or possibly even stack corruption. — Ross Rogers, Aug 25 '16 at 19:04
In my specific case I would actually _expect_ `&bar` to be the same on each invocation since the lambda is on the stack and it is two copies of the exact same lambda. What is weird is that the value of the pointer is being modified to `0x0`. — Ross Rogers, Aug 25 '16 at 19:09
Pointers doesn't have notion of trivial class because their refer to an address which may have an object with "trivial class" or not, in this case both "int" and "A"(as default constructible) are trivial, the pointer is not copied in the original code because all lambdas are expanded in place and at the moment of execution all of them see the same variable even if it's a pointer, but in the first snapshot i posted. — Jans, Aug 25 '16 at 19:42
If you look each expansion of the lambda is also an execution which produce a different evaluation, that's why in that case the pointer is copied. and BTW the pointer isn't set to 0x0 neither in g++ nor in clang. — Jans, Aug 25 '16 at 19:42
Can you help me understand why execution of the original code prints this out on the 2nd invocation? `bar=0` That looks a great deal like the pointer is being set to `null`. — Ross Rogers, Aug 25 '16 at 20:05
@Yakk you're right, It's weird that compiling under g++ seem that after the first lambda execution the pointer is cleaned, I didn't notice that. — Jans, Aug 25 '16 at 20:41

C++ lambda not capturing variable on 2nd expansion in template?

3 Answers3