
The test code is below; `timer` prints the elapsed time on destruction.

    #include <cstdint>
    #include <cstdio>
    #include <functional>
    #include <iostream>
    #include <utility>

    void test_accumulate_bind_function(uint64_t& x, uint64_t i)
    {
         x += i;
    }
    uint64_t res1 = 0, res2 = 0, res3 = 0, res4=0;
    template <typename Function>
    void do_loop_ref(Function & func, const uint64_t upper_limit = 100000)
    {
        for (uint64_t i = 0; i < upper_limit; ++i)
            func(i);
    }

    template <typename Function>
    void do_loop_forward(Function && func, const uint64_t upper_limit = 100000)
    {
        Function f(std::forward<Function>(func));
        for (uint64_t i = 0; i < upper_limit; ++i)
            f(i);
    }

    template <typename Function>
    void do_loop_copy(Function func, const uint64_t upper_limit = 100000)
    {
        for (uint64_t i = 0; i < upper_limit; ++i)
            func(i);
    }

    void test_bind_copy()
    {
        {
            namespace arg = std::placeholders;
            uint64_t x = 0;
            auto accumulator = std::bind(&test_accumulate_bind_function, std::ref(x), arg::_1);
            std::cout << "reference:";
            timer t;
            do_loop_ref(accumulator);
            res1 = x;
        }
        {
            namespace arg = std::placeholders;
            uint64_t x = 0;
            auto accumulator = std::bind(&test_accumulate_bind_function, std::ref(x), arg::_1);
            std::cout << "copy:";
            timer t;
            do_loop_copy(accumulator);
            res2 = x;
        }
        {
            namespace arg = std::placeholders;
            uint64_t x = 0;
            auto accumulator = std::bind(&test_accumulate_bind_function, std::ref(x), arg::_1);
            std::cout << "localcopy:";
            timer t;
            do_loop_forward(accumulator);
            res3 = x;
        }
        {
            namespace arg = std::placeholders;
            uint64_t x = 0;
            auto accumulator = std::bind(&test_accumulate_bind_function, std::ref(x), arg::_1);
            std::cout << "move:";
            timer t;
            do_loop_forward(std::move(accumulator));
            res4 = x;
        }
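        // uint64_t is unsigned, so print with %llu and an explicit cast.
        printf("res1:%llu, res2:%llu, res3:%llu, res4:%llu\n",
               static_cast<unsigned long long>(res1), static_cast<unsigned long long>(res2),
               static_cast<unsigned long long>(res3), static_cast<unsigned long long>(res4));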
    }

    void test_copy()
    {
        {
            uint64_t x = 0;
            auto accumulator = [&x](uint64_t i){ return x += i; };
            std::cout << "reference:";
            timer t;
            do_loop_ref(accumulator);
            res1 = x;
        }
        {
            uint64_t x = 0;
            auto accumulator = [&x](uint64_t i){ return x += i; };
            std::cout << "copy:";
            timer t;
            do_loop_copy(accumulator);
            res2 = x;
        }
        {
            uint64_t x = 0;
            auto accumulator = [&x](uint64_t i){ return x += i; };
            std::cout << "localcopy:";
            timer t;
            do_loop_forward(accumulator);
            res3 = x;
        }
        {
            uint64_t x = 0;
            auto accumulator = [&x](uint64_t i){ return x += i; };
            std::cout << "move:";
            timer t;
            do_loop_forward(std::move(accumulator));
            res4 = x;
        }

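        // uint64_t is unsigned, so print with %llu and an explicit cast.
        printf("res1:%llu, res2:%llu, res3:%llu, res4:%llu\n",
               static_cast<unsigned long long>(res1), static_cast<unsigned long long>(res2),
               static_cast<unsigned long long>(res3), static_cast<unsigned long long>(res4));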
    }

    int main()
    {
        test_copy();
        test_bind_copy();
    }
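
The `timer` class itself is not shown. A minimal sketch of what such a class might look like, assuming it is an RAII helper built on `std::chrono::steady_clock` that reports microseconds (the unit, the output format, and the `elapsed_us` member are assumptions made for illustration, not the original implementation):

```cpp
#include <chrono>
#include <cstdint>
#include <iostream>

// Hypothetical stand-in for the `timer` used in the tests: it records the
// start time in its constructor and prints the elapsed time in its destructor.
class timer
{
public:
    timer() : start_(std::chrono::steady_clock::now()) {}

    // Runs when the enclosing scope ends, so the printed value covers
    // everything executed after the timer was declared.
    ~timer() { std::cout << ' ' << elapsed_us() << " us\n"; }

    // Microseconds elapsed since construction (assumed unit).
    std::int64_t elapsed_us() const
    {
        const auto d = std::chrono::steady_clock::now() - start_;
        return static_cast<std::int64_t>(
            std::chrono::duration_cast<std::chrono::microseconds>(d).count());
    }

private:
    std::chrono::steady_clock::time_point start_;
};
```

Declaring `timer t;` right before the call being measured, as the tests do, keeps the functor construction and the `std::cout` label out of the timed region.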

On my computer (VS2013) the output is:

reference: 196 copy: 65 localcopy: 196 move: 64

res1:4999950000, res2:4999950000, res3:4999950000, res4:4999950000

reference: 359 copy: 361 localcopy: 358 move: 358

res1:4999950000, res2:4999950000, res3:4999950000, res4:4999950000

So why, for the lambda, is passing by value so much faster than passing by reference? I also tested a lambda that captures an empty string by reference, and the output is similar to the above; but when the lambda captures the empty string by value, the times for pass-by-reference and pass-by-value become close.

bashrc's answer prompted me to add two more tests, and the result is interesting: moving costs almost the same as passing by value. But if copying costs the most time, why is passing by value faster than passing by reference?
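
One detail that may be relevant to the two added cases: with a forwarding reference `Function&&`, `Function` is deduced as an lvalue reference type when an lvalue is passed, so `Function f(std::forward<Function>(func));` then declares a reference rather than a local copy. A minimal sketch that checks this, using a hypothetical helper `local_aliases_argument` that mirrors the first line of `do_loop_forward`:

```cpp
#include <utility>

// Mirrors the first statement of do_loop_forward and reports whether the
// local `f` is the same object as the argument (true) or a separate object
// (false). The helper name is made up for this illustration.
template <typename Function>
bool local_aliases_argument(Function&& func)
{
    // Lvalue argument: Function = T&, so `f` is just another reference.
    // Rvalue argument: Function = T, so `f` is move-constructed from func.
    Function f(std::forward<Function>(func));
    return &f == &func;
}
```

If that is what happens here, the "localcopy" case never copies anything and behaves like the "reference" case, while the "move" case works on a genuinely local object like the "copy" case, which would match the timings above.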

1 Answer


Did you run the benchmarks with compiler optimizations turned on? When compiled with -O3 in gcc, both versions look exactly the same in assembly (and hence should perform exactly the same).

Without any optimization flag the code still looks the same except there is an additional level of indirection when you are passing by reference.

    // pass by reference
    mov     rax, QWORD PTR [rbp-24]

    // pass by value
    lea     rax, [rbp-32]

https://www.diffchecker.com/cCBmQuW7

std::function are internally just structs wrapping the internal state of the function (the captured variables in case of lambda). Passing the function by reference can be imagined as passing a pointer to the struct with all the state that's captured by the function. Hence the caller and callee work on the same memory location. When it is passed by value, the caller and callee have independent copies, and reads and writes on one don't affect the other.
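
The pointer-to-struct picture can be sketched with a hand-written functor. This is only a model: the real closure type of `[&x](uint64_t i){ return x += i; }` is unnamed and generated by the compiler, and the names `accumulate_by_ref`, `loop_by_ref` and `loop_by_value` are made up for the illustration.

```cpp
#include <cstdint>

// Hand-written stand-in for the closure of [&x](uint64_t i){ return x += i; }.
// The captured reference is modelled as a pointer member.
struct accumulate_by_ref
{
    std::uint64_t* x;
    std::uint64_t operator()(std::uint64_t i) const { return *x += i; }
};

// The callee receives a reference to the caller's functor, so each call
// first has to reach the functor through that reference.
std::uint64_t loop_by_ref(accumulate_by_ref& func, std::uint64_t n)
{
    for (std::uint64_t i = 0; i < n; ++i)
        func(i);
    return *func.x;
}

// The callee receives its own copy of the functor. The copy lives in the
// callee's frame but still points at the caller's x.
std::uint64_t loop_by_value(accumulate_by_ref func, std::uint64_t n)
{
    for (std::uint64_t i = 0; i < n; ++i)
        func(i);
    return *func.x;
}
```

Both loops update the same kind of captured `x` and produce the same sum; the only difference is whether the functor itself is reached directly or through one more level of indirection.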

In the above case, when the function is passed by value the address is loaded as is, while when it's passed by reference there is an additional memory read before calling the lambda. See "What is the difference between MOV and LEA".

The extra level of indirection may bring into play things like locality of reference, but that would be very tightly coupled to the hardware on which this gets executed. A lot also depends on how this code is compiled, where it is executed, and how it is measured (you haven't shared the code for timer). So with the limited information from the question, the extra read is what I would blame the additional time on.
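
Since the measurement method matters, here is a minimal sketch of one way to reduce noise: run the same work several times and keep the fastest run. The helper `best_of`, its default repeat count, and the choice of nanoseconds are assumptions for illustration, not part of the question's code.

```cpp
#include <chrono>
#include <cstdint>
#include <limits>

// Hypothetical helper: calls `work` `repeats` times and returns the fastest
// single run in nanoseconds. With repeats <= 0 nothing is run and the
// maximum int64 value is returned.
template <typename Work>
std::int64_t best_of(Work&& work, int repeats = 10)
{
    using clock = std::chrono::steady_clock;
    std::int64_t best = std::numeric_limits<std::int64_t>::max();
    for (int r = 0; r < repeats; ++r)
    {
        const auto start = clock::now();
        work();
        const auto stop = clock::now();
        const auto ns = static_cast<std::int64_t>(
            std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count());
        if (ns < best)
            best = ns;
    }
    return best;
}
```

With optimizations on, the result of the timed work should also be used afterwards (as the tests do by storing `x` into `res1` and so on), otherwise the compiler may remove the loop entirely.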

bashrc
  • I added two test cases, but the result is confusing: pass by value performs the same as moving into the local f, and pass by reference performs the same as copying into the local f – user7849408 Apr 12 '17 at 09:06
  • "std::function are internally just structs wrapping the internal state of the function (the captured variables in case of lambda)" not really, no. `std::function` is a type eraser, not a wrapper, and as such does not know anything about the type of the stored function, let alone about its state (e.g. captured variables of lambda as you stated). – bolov Apr 12 '17 at 10:37
  • @bolov Agreed. But I didn't want to deviate from the question. Type erasure would have required a longer footnote. What I wanted to highlight was that the captured state would travel with the function in a functor object. Also whether std::function uses type erasure or some other magic is left to the implementation to decide. – bashrc Apr 12 '17 at 12:05
  • @bashrc please forgive me for nitpicking, but `std::function` is by definition a type erasure class. A function pointer, a type with `operator()` defined, a lambda without capture or a lambda with capture (all accepting 2 int params and returning an int) - all are stored to the same type: `std::function`. This is type erasure. The information about the original type is lost. The type erasure technique used to achieve this is indeed left to the implementation (most likely it's a form of `void*` + polymorphism). – bolov Apr 12 '17 at 13:23
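
The type-erasure point in the last comment can be illustrated with a short sketch: four callables of different types, all taking two `int` parameters and returning an `int`, stored as the same `std::function<int(int, int)>`. The names `add`, `multiply` and `make_callables` are hypothetical.

```cpp
#include <functional>
#include <vector>

// A plain function, stored below as a function pointer.
int add(int a, int b) { return a + b; }

// A type with operator() defined.
struct multiply
{
    int operator()(int a, int b) const { return a * b; }
};

// Every element has the same static type even though the stored callables
// have four different original types; that original type is erased.
std::vector<std::function<int(int, int)>> make_callables(int offset)
{
    std::vector<std::function<int(int, int)>> v;
    v.push_back(&add);                                    // function pointer
    v.push_back(multiply{});                              // functor
    v.push_back([](int a, int b) { return a - b; });      // lambda without capture
    v.push_back([offset](int a, int b) { return a + b + offset; });  // lambda with capture
    return v;
}
```

Note that the benchmark in the question never goes through `std::function`: the lambdas and the `std::bind` result are passed to the templates with their own concrete types.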