38

I am trying to understand move semantics, rvalue references, std::move, etc. I have been trying to figure out, by searching through various questions on this site, why passing a const std::string &name + _name(name) is less recommended than a std::string name + _name(std::move(name)) if a copy is needed.

If I understand correctly, the following requires a single copy (through the constructor) plus a move (from the temporary to the member):

Dog::Dog(std::string name) : _name(std::move(name)) {}

The alternative (and old-fashioned) way is to pass it by reference and copy it (from the reference to the member):

Dog::Dog(const std::string &name) : _name(name) {}

If the first method requires a copy and move both, and the second method only requires a single copy, how can the first method be preferred and, in some cases, faster?

John Bonata
  • 1
    [Does this help?](https://stackoverflow.com/q/10231349/2069064) – Barry Sep 03 '17 at 23:40
  • 1
    Consider rvalue argument expression versus lvalue. But also consider passing an argument down a call hierarchy, which is the usual situation. – Cheers and hth. - Alf Sep 03 '17 at 23:41
  • @Barry Gotcha. So it's essentially an optimization in the cases when a temporary is passed in and could have just grabbed from that instead. If I understand correctly, for non-temporary lvalues it won't make a difference. – John Bonata Sep 03 '17 at 23:46
  • I've never seen that first case. I've seen `Dog::Dog(std::string&& name) : _name(std::forward(name)) {}`, which would use the move constructor if possible. – Silvio Mayolo Sep 03 '17 at 23:50
  • 1
    @SilvioMayolo `name` is not a forwarding-reference in your example. – tkausl Sep 03 '17 at 23:55
  • 2
    When you pass by value then it's possible that you can do with two moves and no copies. With the const reference it always takes at least one copy. – user253751 Sep 04 '17 at 00:05
  • It's worth noting that this moves any needed copy instructions from inside of the function to all of the call sites. This increases the size of the produced binary and has the potential to lower performance since the repetitive code eats up space in the CPU caches. For small programs this may not be a big deal, but for codebases with millions of lines, this can cause increases in binary size and decreases in performance that may not be acceptable. – cdhowie Sep 04 '17 at 00:55
  • You're overlooking the key feature that the function will go on to modify or otherwise store the parameter, independently of the argument – M.M Sep 04 '17 at 03:44

5 Answers

35

Consider calling the various options with an lvalue and with an rvalue:

  1. Dog::Dog(const std::string &name) : _name(name) {}
    

    Whether called with an lvalue or rvalue, this requires exactly one copy, to initialize _name from name. Moving is not an option because name is const.

  2. Dog::Dog(std::string &&name) : _name(std::move(name)) {}
    

    This can only be called with an rvalue, and it will move.

  3.  Dog::Dog(std::string name) : _name(std::move(name)) {}
    

    When called with an lvalue, this copies to pass the argument and then moves to populate the data member. When called with an rvalue, this moves to pass the argument and then moves again to populate the data member. In the rvalue case, the move that passes the argument may be elided. Thus, calling this with an lvalue results in one copy and one move, and calling this with an rvalue results in one to two moves.

The optimal solution is to define both (1) and (2). Solution (3) can have an extra move relative to the optimum. But writing one function is shorter and more maintainable than writing two virtually identical functions, and moves are assumed to be cheap.
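The counts above can be checked with a small instrumented type. This is only a sketch: Tracer, OverloadDog, ValueDog and the counting helpers are hypothetical names introduced for illustration, and the "rvalue" here is an xvalue produced by std::move, so no elision applies.

```cpp
#include <cassert>
#include <utility>

// Hypothetical counters and tracer type (not part of the original answer).
int g_copies = 0;
int g_moves = 0;

struct Tracer {
    Tracer() = default;
    Tracer(const Tracer&) { ++g_copies; }
    Tracer(Tracer&&) noexcept { ++g_moves; }
};

// Options (1) and (2) together: one overload per value category.
class OverloadDog {
public:
    explicit OverloadDog(const Tracer& name) : _name(name) {}
    explicit OverloadDog(Tracer&& name) : _name(std::move(name)) {}
private:
    Tracer _name;
};

// Option (3): a single by-value constructor.
class ValueDog {
public:
    explicit ValueDog(Tracer name) : _name(std::move(name)) {}
private:
    Tracer _name;
};

struct Counts { int copies; int moves; };

// Construct a DogT from an lvalue and report what it cost.
template <typename DogT>
Counts count_lvalue() {
    Tracer t;
    g_copies = g_moves = 0;
    DogT d(t);
    (void)d;
    return {g_copies, g_moves};
}

// Construct a DogT from std::move(t) and report what it cost.
template <typename DogT>
Counts count_xvalue() {
    Tracer t;
    g_copies = g_moves = 0;
    DogT d(std::move(t));
    (void)d;
    return {g_copies, g_moves};
}

// Self-check of the counts claimed in the list above.
void check_counts() {
    Counts a = count_lvalue<OverloadDog>();   // one copy
    assert(a.copies == 1 && a.moves == 0);
    Counts b = count_xvalue<OverloadDog>();   // one move
    assert(b.copies == 0 && b.moves == 1);
    Counts c = count_lvalue<ValueDog>();      // one copy, one move
    assert(c.copies == 1 && c.moves == 1);
    Counts d = count_xvalue<ValueDog>();      // two moves
    assert(d.copies == 0 && d.moves == 2);
}
```

Calling check_counts() from main should pass silently. The by-value version pays exactly one extra move in each case, which is the trade-off described above.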

When calling with a value that is implicitly convertible to string, such as a const char*, the implicit conversion takes place first, which involves a length computation and a copy of the string data. After that we are in one of the rvalue cases above. Here, using a string_view provides yet another option:

  4. Dog::Dog(std::string_view name) : _name(name) {}
    

    When called with a string lvalue or rvalue, this results in one copy. When called with a const char*, one length computation takes place and one copy.
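A minimal sketch of the string_view option, assuming C++17 or later. The class name ViewDog, the name() accessor and string_view_demo are made up for illustration; the only copy of the character data happens when _name is initialized.

```cpp
#include <cassert>
#include <string>
#include <string_view>

class ViewDog {
public:
    // string_view is a cheap, non-owning (pointer, length) pair, so passing
    // it by value does not copy characters. The one copy happens in _name(name).
    explicit ViewDog(std::string_view name) : _name(name) {}

    const std::string& name() const { return _name; }

private:
    std::string _name;
};

void string_view_demo() {
    std::string owned = "Rex";
    ViewDog from_string(owned);          // std::string converts to string_view
    ViewDog from_literal("Fido");        // length computed once, then one copy

    std::string_view full = "Bello the dog";
    ViewDog from_slice(full.substr(0, 5));  // no temporary std::string needed

    assert(from_string.name() == "Rex");
    assert(from_literal.name() == "Fido");
    assert(from_slice.name() == "Bello");
    assert(owned == "Rex");              // the caller's string is untouched
}
```

Note that this option can never steal the buffer of a std::string rvalue, which is why it costs one copy even when the caller no longer needs the string.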

Toby Speight
Jeff Garrett
  • +1 You could point out that the reasoning does not necessarily apply to setter functions or assignment operators (in fact, it mostly applies to constructors). See https://stackoverflow.com/questions/18303287/when-is-overloading-pass-by-reference-l-value-and-r-value-preferred-to-pass-by/18303787#18303787 – Arne Vogel Sep 04 '17 at 08:43
  • 1
    You have considered the possibility that `Dog` is being constructed with either an rvalue `std::string` or an lvalue `std::string`, but additionally there is the case that it is being constructed with something implicitly convertible to `std::string`, namely a `const char*`. (So have the other answers, but yours is closest to being complete.) – Arthur Tacca Sep 04 '17 at 09:03
  • Why in the case (4) `When called with a string lvalue or rvalue, this results in one copy. `? one copy happens when bind the parameter to `name`, and there should be another one happens when initialize `_name`? Thanks – dragonxlwang Jan 05 '21 at 22:06
  • `string_view` is reference-like. It doesn't copy the string it is initialized with. – Jeff Garrett Jan 05 '21 at 23:43
  • if we know we're consuming the value passed in, is it sane to only define the r-value accepting version, forcing callers to specify std::move if they have an l-value, so that future people see "oh, that's a move, it's getting consumed" ? – DWR Mar 27 '21 at 14:34
33

When consuming data, you'll need an object you can consume. When you get a std::string const&, you have to copy the object regardless of whether the caller still needs the argument afterwards.

When the object is passed by value the object will be copied if it has to be copied, i.e., when the object passed is not a temporary. However, if it happens to be a temporary the object may be constructed in place, i.e., any copies may have been elided and you just pay for a move construction. That is, there is a chance that no copy actually happens.
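The temporary case can be sketched with a counting type. All names here (Probe, make_probe, ByValue, ByConstRef, temporary_demo) are hypothetical, and the asserted counts assume the temporary is constructed directly in the parameter, which C++17 guarantees and which earlier compilers typically do as an optimization.

```cpp
#include <cassert>
#include <utility>

// Hypothetical counters, for illustration only.
int g_probe_copies = 0;
int g_probe_moves = 0;

struct Probe {
    Probe() = default;
    Probe(const Probe&) { ++g_probe_copies; }
    Probe(Probe&&) noexcept { ++g_probe_moves; }
};

// Returns a temporary (prvalue).
Probe make_probe() { return Probe(); }

struct ByValue {
    explicit ByValue(Probe p) : member(std::move(p)) {}
    Probe member;
};

struct ByConstRef {
    explicit ByConstRef(const Probe& p) : member(p) {}
    Probe member;
};

void temporary_demo() {
    // By value: the temporary becomes the parameter without a copy or move,
    // so only the move into the member is paid.
    g_probe_copies = g_probe_moves = 0;
    ByValue v(make_probe());
    assert(g_probe_copies == 0 && g_probe_moves == 1);

    // By const reference: the parameter is const, so the member must copy.
    g_probe_copies = g_probe_moves = 0;
    ByConstRef r(make_probe());
    assert(g_probe_copies == 1 && g_probe_moves == 0);

    (void)v;
    (void)r;
}
```

For a std::string holding a long name, that is the difference between handing over a buffer and allocating and copying a new one.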

Dietmar Kühl
  • 8
    There is a third case; if the passed in object is by rvalue reference (ie, the return value of std move). Then by-value does 2 moves and 0 copies, while const ref does 1 copy. – Yakk - Adam Nevraumont Sep 04 '17 at 00:16
  • Regarding your first sentence, I'm confused: Why will you have to copy the object in any case? The point of passing a reference is exactly not to have to copy the object you're refering to. – Alex Dec 30 '19 at 20:33
  • @Alex: note the proviso: “When *consuming* ...”, i.e., when you are using the object to initialize another one. In that case you need a copy, but you [normally] know ahead of time that this is going to happen. Doing nothing (due to copy elision) or creating a copy which can then be moved (when copy elision can’t be used) is the most effective approach when the copy is needed anyway. – Dietmar Kühl Dec 31 '19 at 00:05
  • Ah, makes sense, sorry for the misunderstanding on my part! – Alex Jan 01 '20 at 16:22
10

Short answer first: call by const& will always cost a copy. Depending on the conditions, call by value might cost only one move. But it depends (see the code example below for the scenarios this table refers to):

            lvalue        rvalue      unused lvalue  unused rvalue
            ------------------------------------------------------
const&      copy          copy        -              -
rvalue&&    -             move        -              -
value       copy, move    move        copy           - 
T&&         copy          move        -              -
overload    copy          move        -              - 

So my executive summary would be that call by value is worth considering if

  • move is cheap, since there might be an extra move
  • the parameter is used unconditionally. Call by value also costs a copy if the parameter ends up unused, e.g. because of an if clause.
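The "unused lvalue" column of the table can be illustrated with a conditional store. This is a sketch under assumptions: Name, Kennel, the add_if_* functions and the helper functions are hypothetical and only count copies.

```cpp
#include <cassert>
#include <utility>
#include <vector>

int g_name_copies = 0;  // hypothetical counter, for illustration only

struct Name {
    Name() = default;
    Name(const Name&) { ++g_name_copies; }
    Name(Name&&) noexcept {}
};

class Kennel {
public:
    // const&: a copy is made only if the name is actually stored.
    void add_if_cref(const Name& n, bool keep) {
        if (keep) names_.push_back(n);
    }
    // By value: an lvalue argument is copied at the call site,
    // before the function can decide that it does not need it.
    void add_if_value(Name n, bool keep) {
        if (keep) names_.push_back(std::move(n));
    }
private:
    std::vector<Name> names_;
};

// Copies paid for an lvalue argument that ends up unused.
int copies_when_unused_cref() {
    Kennel k;
    Name n;
    g_name_copies = 0;
    k.add_if_cref(n, false);
    return g_name_copies;
}

int copies_when_unused_value() {
    Kennel k;
    Name n;
    g_name_copies = 0;
    k.add_if_value(n, false);
    return g_name_copies;
}

void unused_parameter_demo() {
    assert(copies_when_unused_cref() == 0);   // nothing stored, nothing copied
    assert(copies_when_unused_value() == 1);  // copy made and thrown away
}
```

If the store only happens on some code paths, the const& or overload versions avoid paying for a copy that is never used.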

Call by value

Consider a function that is used to copy its argument

class Dog {
public:
    void name_it(const std::string& newName) { names.push_back(newName); }
private:
    std::vector<std::string> names;
};

Whether an lvalue or an rvalue is passed to name_it, you'll get a copy operation. That's bad because the rvalue could have been moved instead.

One possible solution would be to write an overload for rvalues:

class Dog {
public:
    void name_it(const std::string& newName) { names.push_back(newName); }
    void name_it(std::string&& newName) { names.push_back(std::move(newName)); }
private:
    std::vector<std::string> names;
};

That solves the problem and everything is fine, except that you now have two functions with almost exactly the same code.

Another viable solution would be to use perfect forwarding, but that also has several disadvantages (e.g. perfect forwarding functions are quite greedy and render an existing overloaded const& function useless, they typically need to be in a header file, they create several functions in the object code, and more):

class Dog {
public:
    template<typename T>
    void name_it(T&& in_name) { names.push_back(std::forward<T>(in_name)); }
private:
    std::vector<std::string> names;
};

Yet another solution would be to use call by value:

class Dog {
public:
    void name_it(std::string newName) { names.push_back(std::move(newName)); }
private:
    std::vector<std::string> names;
};

The important thing is, as you mentioned, the std::move. This way you have one function for both rvalues and lvalues. You will move rvalues but accept an additional move for lvalues, which might be fine if moving is cheap and you copy or move the parameter regardless of conditions.

So in the end I really think it's plain wrong to recommend one way over the others. It strongly depends.

#include <vector>
#include <iostream>
#include <utility>

using std::cout;

class foo{
public:
    //constructor
    foo()  {}
    foo(const foo&)  { cout << "\tcopy\n" ; }
    foo(foo&&)  { cout << "\tmove\n" ; }
};

class VDog {
public:
    VDog(foo name) : _name(std::move(name)) {}
private:
    foo _name;
};

class RRDog {
public:
    RRDog(foo&& name) : _name(std::move(name)) {}
private:
    foo _name;
};

class CRDog {
public:
    CRDog(const foo& name) : _name(name) {}
private:
    foo _name;
};

class PFDog {
public:
    template <typename T>
    PFDog(T&& name) : _name(std::forward<T>(name)) {}
private:
    foo _name;
};

//
volatile int s=0;

class Dog {
public:
    void name_it_cr(const foo& in_name) { names.push_back(in_name); }
    void name_it_rr(foo&& in_name)   { names.push_back(std::move(in_name));}
    
    void name_it_v(foo in_name) { names.push_back(std::move(in_name)); }
    template<typename T>
    void name_it_ur(T&& in_name) { names.push_back(std::forward<T>(in_name)); }
private:
    std::vector<foo> names;
};


int main()
{
    std::cout << "--- const& ---\n";
    {
        Dog a,b;
        foo my_foo;
        std::cout << "lvalue:";
        a.name_it_cr(my_foo);
        std::cout << "rvalue:";
        b.name_it_cr(foo());
    }
    std::cout << "--- rvalue&& ---\n";
    {
        Dog a,b;
        foo my_foo;
        std::cout << "lvalue: -\n";
        std::cout << "rvalue:";
        a.name_it_rr(foo());
    }
    std::cout << "--- value ---\n";
    {
        Dog a,b;
        foo my_foo;
        std::cout << "lvalue:";
        a.name_it_v(my_foo);
        std::cout << "rvalue:";
        b.name_it_v(foo());
    }
    std::cout << "--- T&&--\n";
    {
        Dog a,b;
        foo my_foo;
        std::cout << "lvalue:";
        a.name_it_ur(my_foo);
        std::cout << "rvalue:";
        b.name_it_ur(foo());
    }
    
    
    return 0;
}

Output:

--- const& ---
lvalue: copy
rvalue: copy
--- rvalue&& ---
lvalue: -
rvalue: move
--- value ---
lvalue: copy
    move
rvalue: move
--- T&&--
lvalue: copy
rvalue: move
Community
DrSvanHay
0

Performance reasons aside, when the copy throws an exception with a by-value constructor, it is thrown in the caller, before the constructor is entered, rather than inside the constructor itself. This makes it easier to write noexcept constructors without worrying about resource leaks or needing a try/catch block on the constructor.

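#include <cstdlib>   // EXIT_SUCCESS
#include <iostream>
#include <string>
#include <utility>   // std::move

struct A {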
    std::string a;

    A( ) = default;
    ~A( ) = default;
    A( A && ) noexcept = default;
    A &operator=( A && ) noexcept = default;

    A( A const &other ) : a{other.a} {
        throw 1;
    }
    A &operator=( A const &rhs ) {
        if( this != &rhs ) {
            a = rhs.a;
            throw 1;
        }
        return *this;
    }
};

struct B {
    A a;

    B( A value ) try : a { std::move( value ) }
    { std::cout << "B constructor\n"; }
    catch( ... ) {
        std::cerr << "Exception in B initializer\n";
    }
};

struct C {
    A a;

    C( A const &value ) try : a { value }
    { std::cout << "C constructor\n"; }
    catch( ... ) {
        std::cerr << "Exception in C initializer\n";
    }
};

int main( int, char ** ) {

    try {
        A a;
        B b{a};
    } catch(...) { std::cerr << "Exception outside B2\n"; }



    try {
        A a;
        C c{a};
    } catch(...) { std::cerr << "Exception outside C\n"; }

    return EXIT_SUCCESS;
}

Will output

Exception outside B2
Exception in C initializer
Exception outside C
Beached
0

I made an experiment:

#include <cstdio>
#include <utility>

struct Base {
  Base() { id++; }
  static int id;
};

int Base::id = 0;

struct Copyable : public Base {
  Copyable() = default;
  Copyable(const Copyable &c) { printf("Copyable [%d] is copied\n", id); }
};

struct Movable : public Base {
  Movable() = default;

  Movable(Movable &&m) { printf("Movable [%d] is moved\n", id); }
};

struct CopyableAndMovable : public Base {
  CopyableAndMovable() = default;

  CopyableAndMovable(const CopyableAndMovable &c) {
    printf("CopyableAndMovable [%d] is copied\n", id);
  }

  CopyableAndMovable(CopyableAndMovable &&m) {
    printf("CopyableAndMovable [%d] is moved\n", id);
  }
};

struct TEST1 {
  TEST1() = default;
  TEST1(Copyable c) : q(std::move(c)) {}
  TEST1(Movable c) : w(std::move(c)) {}
  TEST1(CopyableAndMovable c) : e(std::move(c)) {}

  Copyable q;
  Movable w;
  CopyableAndMovable e;
};

struct TEST2 {
  TEST2() = default;
  TEST2(Copyable const &c) : q(c) {}
  //  TEST2(Movable const &c) : w(c)) {}
  TEST2(CopyableAndMovable const &c) : e(std::move(c)) {}  // c is const, so this still copies

  Copyable q;
  Movable w;
  CopyableAndMovable e;
};

int main() {
  Copyable c1;
  Movable c2;
  CopyableAndMovable c3;
  printf("1\n");
  TEST1 z(c1);
  printf("2\n");
  TEST1 x(std::move(c2));
  printf("3\n");
  TEST1 y(c3);

  printf("4\n");
  TEST2 a(c1);
  printf("5\n");
  TEST2 s(c3);

  printf("DONE\n");
  return 0;
}

And here is the result:

1
Copyable [4] is copied
Copyable [5] is copied
2
Movable [8] is moved
Movable [10] is moved
3
CopyableAndMovable [12] is copied
CopyableAndMovable [15] is moved
4
Copyable [16] is copied
5
CopyableAndMovable [21] is copied
DONE

Conclusion:

template <typename T>
Dog::Dog(const T &name) : _name(name) {} 
// if T is only copyable, then it will be copied once
// if T is only movable, it results in compilation error (conclusion: define separate move constructor)
// if T is both copyable and movable, it results in one copy

template <typename T>
Dog::Dog(T name) : _name(std::move(name)) {}
// if T is only copyable, then it results in 2 copies
// if T is only movable, and you called Dog(std::move(name)), it results in 2 moves
// if T is both copyable and movable, it results in one copy, then one move.
warchantua