Creating std::function with lambda causes superfluous copying of the lambda object - why?

Question

When I construct a std::function from a lambda with captured values, it makes an additional copy (move) of those values (actually of the whole lambda object, I guess). The code:

#include <iostream>
#include <functional>

// Testing class - just to see constructing/destructing.
class T {
private:
    static int idCounter; // The global counter of the constructed objects of this type.
public:
    const int id; // Unique object ID 

    inline T() : id(++idCounter) { 
        std::cout << "  Constuctor Id=" << id << std::endl;
    };
    inline T(const T& src) : id(++idCounter) {
        std::cout << "  Copy constructor Id=" << id << std::endl;
    }
    inline T(T&& src) : id(++idCounter) {
        std::cout << "  Move constructor Id=" << id  << std::endl;
    }
    inline void print() const {
        std::cout << "  Print is called for object with id=" << id << std::endl;
    }
    inline ~T() {
        std::cout << "  Destructor Id=" << id << std::endl;
    }
};

int T::idCounter=0; 

// Declare type of the std::function to store our lambda.
typedef std::function<int (void)> Callback;

int main()
{ 
    std::cout << "Let's the game begin!" << std::endl;
    T obj; // Construct the first object.
    std::cout << "Let's create a pointer to the lambda." << std::endl;
    // Make a lambda with a captured object. (The lambda prints and returns the object's id.)
    // It should make one (local) copy of the captured object, but it makes two - why?!
    const Callback* pcb= new Callback( [obj]() -> int { 
        obj.print();
        return obj.id; 
    } );
    std::cout << "Now let's print lambda execution result."  << std::endl;
    std::cout << "The functor's id is " << (*pcb)() << std::endl;
    std::cout << "Destroying the lambda." << std::endl;
    delete pcb;
    std::cout << "Terminating." << std::endl;
    return 0;

}

The output is:

Let's the game begin!
  Constuctor Id=1
Let's create a pointer to the lambda.
  Copy constructor Id=2
  Move constructor Id=3
  Destructor Id=2
Now let's print lambda execution result.
  Print is called for object with id=3
The functor's id is 3
Destroying the lambda.
  Destructor Id=3
Terminating.
  Destructor Id=1

I made a std::function from a lambda with a captured object. It should make a local copy of the object for the lambda, but it constructs the object twice (look at the move constructor call, Id=3). Actually it makes a copy of the whole lambda object. Why? How can I avoid that? I am using lambdas for inter-thread event processing, and they may capture noticeable amounts of data, so I am trying to find a way to avoid unnecessary copying. The task is simple: pass a constructed lambda into a function at minimal cost. If the data is copied twice for every constructed lambda, I will look for another way to work with events.
I am using GCC v4.7.2 forced to GNU C++11.

The move happens when the lambda is moved in the initialization list of the constructor of `std::function`. This *moving-the-lambda* forces the captured object to move as well (i.e. recursively moving!). – Nawaz May 07 '14 at 10:35
@op, moving is not copying (of course you can implement it like that, but why would you?). A sensible implementation for your test class would not increment the id but instead transfer the id of the moved-from (temporary) object to the new instance. – Tamás Szelei May 07 '14 at 11:20
In real life, in a complex project, you can't guarantee that moving is cheap. You are using third-party libraries, there are multithreading issues, etc. As an example: is moving a std::vector of 10k strings cheap? – Sap May 07 '14 at 11:30

Nawaz · Answer 1 · 2014-05-07T11:49:02.057

Well, the output is confusing because there is one copy-elision performed by the compiler. So in order to understand the behaviour, we need to disable the copy-elision for a while. Use -fno-elide-constructors flag when compiling the code:

$ g++ -std=c++11 -fno-elide-constructors main.cpp

Now it gives this output (demo-without-copy-elision):

Let's create a pointer to the lambda.
  Copy constructor Id=2
  Move constructor Id=3
  Move constructor Id=4
  Destructor Id=3
  Destructor Id=2

Well, that is expected. The copy is done when creating the lambda:

[obj]() -> int {
//^^^ COPY!
    obj.print();
    return obj.id;
}

Well, that is too obvious!

Now coming to the non-obvious thing : the two move operations!

The first move happens when the lambda is passed to the constructor of std::function: in C++11 that constructor takes the callable by value, and the lambda is an rvalue, so the parameter is move-constructed. Note that -fno-elide-constructors also disables elision of moves (a move is, after all, just a supposedly faster version of a copy).

The second move happens when the constructor writes the lambda (by moving, of course) into the member data of std::function in its initialization list.

So far so good.
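
The two moves can be reproduced without std::function. The following is only a sketch under stated assumptions: `Tracker`, `Closure` and `Holder` are hypothetical names that do not appear in the question, `Closure` is a hand-written stand-in for the lambda, and `Holder` merely mimics the constructor shape described above (callable taken by value, then moved into a member). It is not the library's actual implementation.

```cpp
#include <cassert>
#include <utility>

// Hypothetical tracker type: counts copy and move constructions.
struct Tracker {
    static int copies;
    static int moves;
    Tracker() = default;
    Tracker(const Tracker&) { ++copies; }
    Tracker(Tracker&&) noexcept { ++moves; }
};
int Tracker::copies = 0;
int Tracker::moves = 0;

// Hand-written stand-in for a lambda with a by-value capture:
// a closure is essentially a struct whose members are the captures.
struct Closure {
    Tracker captured;
    int operator()() const { return 42; }
};

// Hypothetical wrapper with the same constructor shape as described above.
template <typename F>
class Holder {
public:
    explicit Holder(F f)             // move 1: temporary -> parameter (may be elided)
        : stored_(std::move(f)) {}   // move 2: parameter -> member (always happens)
    int operator()() const { return stored_(); }
private:
    F stored_;
};

// Builds a Holder from a temporary closure and returns the number of moves seen.
int count_moves_into_holder() {
    Tracker::copies = 0;
    Tracker::moves = 0;
    Tracker original;
    Holder<Closure> holder(Closure{original});  // copy: original -> captured
    assert(Tracker::copies == 1);
    assert(holder() == 42);
    return Tracker::moves;
}
```

Calling `count_moves_into_holder()` should report one move when the compiler elides the temporary-to-parameter step and two when it is built with -fno-elide-constructors in C++11 mode; the move into the member remains in both cases.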

Now if you remove -fno-elide-constructors, the compiler optimizes away the first move (because of which it doesn't invoke the move constructor), which is why the output is this:

Let's create a pointer to the lambda.
  Copy constructor Id=2
  Move constructor Id=3
  Destructor Id=2

See demo-with-copy-elision.

The move you still see comes from moving the lambda into the member data of std::function. You cannot avoid this move.

Also note that copying/moving the lambda also causes copying/moving the captured data (i.e. recursively copying/moving).

Anyway, if you're worried about copying the captured object (assuming it is a huge object), then I would suggest creating the captured object with new, so that copying the captured object means copying a pointer (4 or 8 bytes!). That should work great!
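
A minimal sketch of that idea follows, with one substitution stated plainly: it uses std::shared_ptr instead of a raw new, so that the heap object is released when the last callback holding it goes away. `Payload` and `make_callback` are hypothetical names introduced only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical payload type: counts how often it is copied or moved.
struct Payload {
    static int copies;
    static int moves;
    std::vector<int> data;
    explicit Payload(std::size_t n) : data(n, 7) {}
    Payload(const Payload& other) : data(other.data) { ++copies; }
    Payload(Payload&& other) noexcept : data(std::move(other.data)) { ++moves; }
};
int Payload::copies = 0;
int Payload::moves = 0;

// Creates the payload on the heap once; the lambda captures only the pointer.
std::function<int()> make_callback(std::size_t n) {
    std::shared_ptr<const Payload> payload = std::make_shared<Payload>(n);
    return [payload]() -> int { return static_cast<int>(payload->data.size()); };
}

// Copies the callback and checks that the payload itself was never copied or moved.
int payload_transfers_after_copying_callback() {
    Payload::copies = 0;
    Payload::moves = 0;
    std::function<int()> first = make_callback(1000);
    std::function<int()> second = first;  // copies the shared_ptr, not the Payload
    assert(first() == 1000 && second() == 1000);
    return Payload::copies + Payload::moves;
}
```

Copying or moving the lambda now only adjusts a reference count and a pointer. The trade-off is that all copies of the callback share one payload, so it should be treated as read-only (hence `const Payload`) or be synchronized when threads are involved.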

Hope that helps.

Can I avoid it? Can I make a pointer to a lambda without the (second) move? – Sap May 07 '14 at 11:42
Pointer to lambda? I don't think there is any clean way to do that (or that it is even possible). Anyway, if you're worried about copying the captured object (assuming it is a huge object), then I would suggest creating the captured object with `new`, so that copying the captured object means copying a pointer (4 or 8 bytes!). – Nawaz May 07 '14 at 11:46

score 2 · Answer 2 · answered May 07 '14 at 10:49

It does not copy twice: the second operation is a move. Moving is considered a cheap operation, and practically in 99% of the cases it is. For 'plain old data' types (structs, ints, doubles, ...) the double copying is a non-issue, as most compilers eliminate redundant copies (data-flow analysis). For containers, moving is a very cheap operation.
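
As a rough illustration of the container case, the sketch below checks that move-constructing a std::vector of 10,000 strings hands over the existing buffer instead of copying the elements. The function name is hypothetical, and the pointer comparison reflects how mainstream standard library implementations behave.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Returns true if the moved-to vector reuses the source vector's buffer.
bool vector_move_reuses_buffer() {
    std::vector<std::string> source(10000, "payload");
    const std::string* buffer = source.data();
    std::vector<std::string> target(std::move(source));  // transfers ownership of the buffer
    assert(target.size() == 10000);
    return target.data() == buffer;
}
```

This only holds for types that actually implement a move constructor. A type that declares a copy constructor but no move constructor is copied when "moved", which is the situation to watch for with third-party classes.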

Alex Shtoff

I am using big and complex external data objects. "Moving is a cheap operation" is not the point. Sometimes it may be cheap, sometimes not, and faith is not a good basis for programming. As far as I can see, it's impossible to use lambdas without double copying in GCC C++11 :( – Sap May 07 '14 at 11:20
What do you mean by 'external'? Can't you write a "cheap" move constructor for your data objects? – Alex Shtoff May 07 '14 at 11:26
By 'external' I mean that the captured objects are created outside of the function that creates the lambda, so usually they cannot be optimized as well as locally created objects. I am also using third-party libraries, so I cannot be sure that all data supports efficient move constructors. Yes, I can write wrappers or use shared pointers, but I am surprised that there is no way to use (store) lambdas without double copying. – Sap May 07 '14 at 11:36

score 0 · Answer 3 · edited May 23 '17 at 11:51

As mentioned by Nawaz in the comments, the extra move operation that you are worried about is performed when the lambda expression is moved into the std::function<int(void)> (typedef'ed as Callback).

const Callback* pcb= new Callback( [obj]() -> int { 
    obj.print();
    return obj.id; 
} );

Here the object obj is passed by value (copy constructed) to the lambda expression, but additionally, the entire lambda is passed as an rvalue to the constructor of Callback (std::function) and is therefore move-constructed into the std::function object. When the lambda is moved, all of its captured state must move along with it, and hence obj is also moved (there are actually two move constructions of obj involved, but one of them is usually optimized out by the compiler).

Equivalent code:

auto lambda = [obj]() -> int {                        // Copy obj into lambda.
    obj.print();
    return obj.id; 
};

const Callback* pcb= new Callback(std::move(lambda)); // Move lambda (and obj).

Move operations are considered cheap and won't cause any costly copying of data (in most cases).

You can read more about move semantics here: What are move semantics?

Finally, if you don't want to copy obj at all, simply capture it by reference in the lambda (the lambda must then not outlive obj):

const Callback* pcb= new Callback( [&obj]() -> int { 
    obj.print();
    return obj.id; 
} );
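
To check that a reference capture really constructs no additional objects, here is a small sketch. `Counted`, `call_by_reference` and `constructions_with_reference_capture` are hypothetical names used only for this illustration; `Counted` plays the role of the question's test class.

```cpp
#include <cassert>
#include <functional>

// Hypothetical test type: every construction (default, copy or move) bumps the counter.
struct Counted {
    static int constructions;
    int id;
    Counted() : id(++constructions) {}
    Counted(const Counted&) : id(++constructions) {}
    Counted(Counted&&) noexcept : id(++constructions) {}
};
int Counted::constructions = 0;

// Takes any callable by const reference and invokes it; the callable is not copied.
template <typename F>
int call_by_reference(const F& f) { return f(); }

// Returns how many Counted objects were constructed in total.
int constructions_with_reference_capture() {
    Counted::constructions = 0;
    Counted obj;                                       // the only construction
    auto lambda = [&obj]() -> int { return obj.id; };  // stores a reference, no copy
    int direct = call_by_reference(lambda);
    std::function<int()> callback(lambda);             // copies the lambda, i.e. just the reference
    int wrapped = callback();
    assert(direct == 1 && wrapped == 1);
    return Counted::constructions;
}
```

The function returns 1: only the original object is ever constructed. The price is a lifetime obligation. If the callback is handed to another thread or stored beyond the scope of `obj`, the reference dangles, so this only fits cases where the captured object is guaranteed to outlive every use of the callback.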

answered May 07 '14 at 11:14

Felix Glas

Is it possible to use lambdas without double-copying? All I need is a "pointer-to-the-lambda". With std::function double copying is inevitable, isn't it? That's bad news. – Sap May 07 '14 at 11:26
@user3544995 You don't have to wrap the lambda in a `std::function`. Simply call it directly e.g. `auto pcb = [&obj] { /* ... */ }; std::cout << "The functor's id is " << pcb() << std::endl;`. This way no copies have to be made, ever. – Felix Glas May 07 '14 at 11:34
Heh :) Call it directly... And what about storing them and passing them to other functions? For event handling I need the lambda to be stored in the event object, passed to another function (another thread, actually) and then executed. How can I do that with an auto type? What type would the object field holding the lambda have? – Sap May 07 '14 at 11:39
@user3544995 By using templates the compiler can deduce the type of the lambda so that you can pass references to it around among your functions. If threads are involved and you have to make a local copy to avoid data races, well then you *have* to make the copy anyway. – Felix Glas May 07 '14 at 11:44
@user3544995 Example of a function that takes a reference to any callable (such as a lambda) and calls it: `template <typename T> void func(const T& t) { t(); }`. No copying involved. – Felix Glas May 07 '14 at 11:48
Could you tell me how to use that in practice? What type must a class field have to store a lambda? – Sap May 07 '14 at 12:14
@user3544995 If you have a lambda e.g. `auto lambda = [&obj] { /* ... */ };` then you can use the above function and call it like this: `func(lambda);`. This will call the `()` operator of the lambda functor from within the function `func` and it will not involve any copies of either `obj` or the lambda. – Felix Glas May 07 '14 at 12:22
@user3544995 The type of a lambda is implementation defined and cannot be explicitly declared by the programmer. Instead it must be deduced by the compiler, using `auto`, `decltype` or templates, among others. This does *not* limit you though when you have access to the mentioned capabilities. – Felix Glas May 07 '14 at 12:28
I see now. But can a lambda be wrapped in a template class? Something like a lightweight std::function? I am still trying to avoid that second copy/move by writing my own function-like wrapper. – Sap May 07 '14 at 13:27
