5

Assume I have the following class, which has a method set_value. Which implementation is better?

class S {
public:
  // a set_value method

private:
  Some_type value;
};

Pass by value, then move

void S::set_value(Some_type value)
{
  this->value = std::move(value);
}

Define two overloaded methods

void S::set_value(const Some_type& value)
{
  this->value = value;
}

void S::set_value(Some_type&& value)
{
  this->value = std::move(value);
}

The first approach requires definition of one method only while the second requires two.

However, the first approach seems to be less efficient:

  1. Copy/Move constructor for the parameter depending on the argument passed
  2. Move assignment
  3. Destructor for the parameter

While for the second approach, only one assignment operation is performed.

  1. Copy/Move assignment depending on which overloaded method is called

So, which implementation is better? Or does it matter at all?

And one more question: Is the following code equivalent to the two overloaded methods in the second approach?

template <class T>
void S::set_value(T&& value)
{
  this->value = std::forward<T>(value);
}
  • 3
    The answer requires a longer discussion. You should read chapter 5 in [Effective Modern C++](http://shop.oreilly.com/product/0636920033707.do). Quick version: Pass by value may be less efficient and fail for non-copyables. Two overloads create redundant code, especially with multiple parameters. The template version is efficient and short, but has bad error messages and fails on `initializer_list`s and overload resolution. Pick your poison, it's a case by case judgment call. – nwp Jul 01 '15 at 06:40
  • Related: http://stackoverflow.com/a/26286741 – dyp Jul 01 '15 at 07:22

1 Answers1

0

The compiler is free to elide (optimise away) the copy even if there would be side effects in doing so. As a result, passing by value and moving the result actually gives you all of the performance benefits of the two-method solution while giving you only one code path to maintain. You should absolutely prefer to pass by value.

here's an example to prove it:

#include <iostream>

struct XYZ {
    XYZ() { std::cout << "constructed" << std::endl; }

    XYZ(const XYZ&) {
        std::cout << "copy constructed" << std::endl;
    }
    XYZ(XYZ&&) noexcept {
        try {
            std::cout << "move constructed" << std::endl;
        }
        catch(...) {

        }
    }

    XYZ& operator=(const XYZ&) {
        std::cout << "assigned" << std::endl;
        return *this;
    }

    XYZ& operator=(XYZ&&) {
        std::cout << "move-assigned" << std::endl;
        return *this;
    }

};

struct holder {
    holder(XYZ xyz) : _xyz(std::move(xyz)) {}

    void set_value(XYZ xyz) { _xyz = std::move(xyz); }
    void set_value_by_const_ref(const XYZ& xyz) { _xyz = xyz; }

    XYZ _xyz;
};
using namespace std;

auto main() -> int
{
    cout << "** create named source for later use **" << endl;
    XYZ xyz2{};

    cout << "\n**initial construction**" << std::endl;
    holder h { XYZ() };

    cout << "\n**set_value()**" << endl;
    h.set_value(XYZ());

    cout << "\n**set_value_by_const_ref() with nameless temporary**" << endl;
    h.set_value_by_const_ref(XYZ());

    cout << "\n**set_value() with named source**" << endl;
    h.set_value(xyz2);

    cout << "\n**set_value_by_const_ref() with named source**" << endl;
    h.set_value_by_const_ref(xyz2);
    return 0;
}

expected output:

** create named source for later use **
constructed

**initial construction**
constructed
move constructed

**set_value()**
constructed
move-assigned

**set_value_by_const_ref() with nameless temporary**
constructed
assigned

**set_value() with named source**
copy constructed
move-assigned

**set_value_by_const_ref() with named source**
assigned

note the absence of any redundant copies in the copy/move versions but the redundant copy-assignment when calling set_value_by_const_ref() with nameless temporary. I note the apparent efficiency gain of the final case. I would argue that (a) it's a corner case in reality and (b) the optimiser can take care of it.

my command line:

c++ -o move -std=c++1y -stdlib=libc++ -O3 move.cpp
Richard Hodges
  • 68,278
  • 7
  • 90
  • 142
  • 1
    Copy and move elision only applies to the *construction of new objects*, and only in certain situations (temporaries, RVO, NRVO). Move assignment is not covered by the exceptions of copy/move elision, it must always be performed under the as-if rule. – dyp Jul 01 '15 at 07:17
  • @dyp respectfully, this is a common misconception. The compiler is explicitly free to construct the parameter of the `set_value` function in-place when `set_value` is called with a nameless temporary. The move is also explicitly allowed to be optimised away. – Richard Hodges Jul 01 '15 at 07:27
  • I'm not talking about the function parameter. I'm talking about the *move assignment* inside the setter function. You don't discriminate between *move construction* and *move assignment* in the text of your answer (*"The compiler is free to elide (optimise away) the copy and move"*), so even if you do know this distinction wrt move elision, the text is IMHO misleading. – dyp Jul 01 '15 at 08:11
  • 1
    *"gives you all of the performance benefits of the two-method solution "* And this wrong if you assign from lvalues to types that maintain resources, such as `std::string` or `std::vector`. A `const&` setter function allows to reuse the existing resource managed by the LHS of the assignment, whereas a by-value setter forces the user to acquire a new resource, if the argument is an lvalue. – dyp Jul 01 '15 at 08:12
  • @RichardHodges When `set_value` is called with nameless temporary, I know that both approaches are equally efficient because of the in-place construction of the parameter. However, when it is called by named variable, `set_value(x)` will invoke the copy constructor for the parameter, and `set_value(std::move(x))` will invoke the move constructor for the parameter before the move assignment. While for the two-method version, the argument is directly referred without extra construction. For this case, isn't pass-by-value version less efficient? –  Jul 01 '15 at 08:43
  • @dyp assigning from a const& resource-handler would necessitate either a copy of the internal resource or a copy of the shared_ptr to it. Either way, the copy/move idiom is at least as efficient and easier to maintain. – Richard Hodges Jul 01 '15 at 08:59
  • @dyp updated the answer to reflect all our points. – Richard Hodges Jul 01 '15 at 09:13
  • Herb Sutter's talk "Back to the Basics!" contains a [detailed comparison of the various ways how to write a setter function](https://www.youtube.com/watch?v=xnqTKD8uD64&feature=youtu.be&t=3066). The basic argument is as follows: Function arguments are often lvalues, and you often use them in tight loops where performance matters the most. So the setter function should be optimal for lvalues. When you copy-assign to a `std::string` or `std::vector`, the LHS can reuse an existing resource, and the RHS doesn't need to release its resource. This is not possible when the setter takes by value. – dyp Jul 01 '15 at 09:54
  • @dyp while I accept Herb's position I think he is making the mistake of 'premature optimisation'. By all means if it transpires that a tiny object is being copied multiple times into a container then pass-by-reference could be warranted. The algorithm `std::fill` is an example of this (it uses `insert(x)` rather than `emplace(x)`). My main concern for our development is correctness. I only allow optimisation (such as having 2 versions of a setter) when it's required - which is actually in reality never, since modern computers have more power than the sun... – Richard Hodges Jul 01 '15 at 10:57
  • I don't quite get your argument: As far as I understand Herb Sutter's position, he's advocating using a single `const&` overload by default. If it then turns out you need optimizations, add a `&&` overload. You seem to be arguing that one should use a single by-value overload by default (potentially splitting into `const&` and `&&` for optimization?). So why use by-value rather than `const&` by default? (Also, Sutter's argument is not "a tiny object is being copied multiple times into a container", but avoiding an expensive acquisition of a resource, e.g. for "big" objects such as vectors). – dyp Jul 01 '15 at 11:03
  • @dyp vectors are move-aware. they're cheap to move and as cheap to acquire as to copy – Richard Hodges Jul 01 '15 at 11:06
  • *"and as cheap to acquire as to copy"* This is too general. Vectors can be much more cheap to copy-assign than to copy-construct: when the lhs of a copy-assignment has enough capacity, *no new memory needs to be allocated* (for the default and many other allocators). And this recursively applies to the element type. Copy construction OTOH always needs to acquire new memory. – dyp Jul 01 '15 at 11:33
  • @dyp I completely understand, and it's a very well made point. I still think it smacks of premature optimisation. When are we ever going to be assigning multiple copies of the same vector of expensive (copyable) objects in a tight loop? Code like that would be unlikely to make it past code review. I guess ultimately arguments like this over philosophy are just that - we all have good reasons for what we say. I think we're as correct as each other. – Richard Hodges Jul 01 '15 at 11:38
  • I know that *preferring `const&` over by-value* must be made based on the use cases. Both have situations where they are more efficient than the other one. But your answer still claims that *"passing by value and moving the result actually gives you all of the performance benefits of the two-method solution"* and its introductory text is still IMO misleading, because it does not discriminate between assignment and construction. -- `while(cin >> mystring) { myarr[i++].set_name(mystring); }` is an example where you'd want to reuse existing capacity within the array elements (or SSO). – dyp Jul 01 '15 at 11:49
  • @dyp wouldn't you prefer this `while(cin >> mystring) { myarr[i++].set_name(move(mystring)); }` ? Now the redundant copy has been removed. – Richard Hodges Jul 01 '15 at 11:54
  • @RichardHodges No, I wouldn't prefer that: it steals `mystring`'s resource (memory), so that the next `cin >> mystring` needs to reacquire a new resource. Moving in the string discards (!) the resource of the string in `myarr[i]`. -- That is, the move version has to acquire a new resource for each element of the array, while the copy version only needs to acquire a new resource for those elements of the array where the existing resource is insufficient. – dyp Jul 01 '15 at 12:08
  • @dyp I have just watched Herb Sutter's talk and the answer of the [post](http://stackoverflow.com/questions/26261007/why-is-value-taking-setter-member-functions-not-recommended-in-herb-sutters-cpp/26286741#26286741). I'm convinced that using the const-ref version (plus a rvalue-ref when necessary) is preferable to avoid unnecessary deallocations. Another question which then comes in my mind is that when defining the assignment operator, is it bad to use the copy and swap idiom for the same reason? I learned the copy and swap idiom from the book 'C++ Primer'. –  Jul 01 '15 at 15:05
  • 1
    @DannyYL The copy-and-swap idiom is a simplification, whereas the default way of passing an argument to a setter (by-value or by-const-ref) is more of a choice between similarly complex solutions. In copy-and-swap vs explicitly implementing assignment, you have to choose between performance and simplicity. Since you shouldn't have to deal often with implementing assignment (rule of zero) other than for generic low level data types / low level abstractions, it typically makes sense to consider performance as a potential issue (generality => you don't know how it's used). – dyp Jul 01 '15 at 15:19
  • @dyp to be pendantic, arguing that moving from a string 'steals' its resources is actually making a claim about the implementation of std::string which you cannot in actuality make. There a few implementations out there - simple wrappers around new/delete, copy-on-write and the 'small-string-as-value' being three of them. In two of those the vast majority of cases (short strings) will not steal or allocate any resources at all. In the case of long strings it ceases to matter since the cost of copying text starts to outweigh the cost of allocation. I still believe you're optimising prematurely. – Richard Hodges Jul 01 '15 at 15:51
  • I'm not very familiar with the `basic_string` class template, I'm more familiar with `vector`. However, copy-on-write has been made illegal in C++11 to better support concurrency (some COW const functions have to modify characters), and from what I can see / know from `vector`, the allocator interface and the noexceptness of the move ctor effectively prevent memory allocation for move construction. SSO is very limited, it supports something between 8 and 32 8-bit characters on (not crazy) 64-bit implementations. But we're actually talking about a reasonable default and how it treats resources – dyp Jul 01 '15 at 16:31