2

Whenever I want to implement a class that needs a move constructor and some form of a copy constructor I find myself wondering about the following:

For an example class C that stores a std::vector<std::string> _s, should I prefer:

C(std::vector<std::string> s) : _s(std::move(s)) {}
C(std::vector<std::string>&& s) : _s(std::move(s)) {}

or

C(const std::vector<std::string>& s) : _s(s) {}
C(std::vector<std::string>&& s) : _s(std::move(s)) {}

?

The move constructor is straight-forward, but for the other there is a slight difference: The first option will copy upon calling the constructor and then move the copy into _s. For the second option, I am avoiding the copy by specifying the parameter to take a reference only to then copy the content myself. So in any case, I end up with one copy taking place for my "copy constructor" (the real and the fake one). Maybe an explanation towards the cost of the move that happens inside C(std::vector<std::string> s) : _s(std::move(s)) {} would help but I also feel like not knowing what the right questions are.

Given my understanding I don't really see what the better option is.

huzzm
  • 489
  • 9
  • 24
  • 1
    There are a few strong schools of thought of differing opinion. For a **sink** parameter, I favor `C(std::vector s) : _s{std::move(s)} {}` as the pattern. No `std::vector&&` variant needed. C++23 will provide a better alternative, but I'm stuck in C++17 land. The opposing school of thought would have only the `&&` version, and force the callsite to explicitly do what is needed. – Eljay Jun 23 '23 at 16:48
  • 2
    Related to [pass-by-value-vs-pass-by-rvalue-reference](https://stackoverflow.com/questions/37935393/pass-by-value-vs-pass-by-rvalue-reference) and [taking-sink-parameters-by-rvalue-reference-instead-of-by-value-to-enforce-performance](https://stackoverflow.com/questions/61597860/taking-sink-parameters-by-rvalue-reference-instead-of-by-value-to-enforce-perfor) – Jarod42 Jun 23 '23 at 16:52

2 Answers2

3

Howard Hinnant in a talk highly recommended implementing each special member function individually and not implement one in terms of the other. For performance, he also discouraged the use of the copy-and-swap idiom in the assignment operator, which is similar to the first option shown in the question (C(std::vector<std::string> s) : _s(std::move(s)) {}). Such an unified assignment operator can accept both rvalues and lvalues. That is true for this constructor also.

He shows empirical evidence based on a class containing a vector (just like in the question). Especially, if the vector in the target has sufficient capacity, copy-and-swap based assignment can be eight times slower than a dedicated assignment operator. Thus, in the case of C(std::vector<std::string> s) : _s(std::move(s)) {} accepting an rvalue, there will be unnecessary moving of the rvalue to the parameter s (which is an lvalue) and it is definitely not desirable.

Thus, separate constructors for lvalue and rvalue arguments are needed. In the case of lvalue being passed, whether the copying should happen at the constructor parameter stage or in the initialization list could be profiled and found out. However, passing an lvalue argument as reference to const lvalue is idiomatic and similar to what a copy constructor would do.

Update: HolyBlackCat has addressed the scalability of the design, which is very important. One more way to achieve it is via parameter pack based forwarding references. However, one should be careful about the pitfall of such a constructor superseding other constructors and one should constrain the template using SFINAE or concepts (C++20). Items 25, 26 and 27 of Effective Modern C++ by Scott Meyers discusses about these aspects.

Update: In the comments below, Brian Bi has raised the question "How costly is the move that you are trying to save?", when you prefer

C(const std::vector<std::string> &s) : _s(s) {}
C(std::vector<std::string>&& s) : _s(std::move(s)) {}

to

C(std::vector<std::string> s) : _s(std::move(s)) {}

Indeed, the extra move incurred by pass-by-value is not an issue for several types. Thus, being mindful about the types involved is important. Item 41 of the above book discusses about this debate between the two choices, more generally, not just for constructors. Some guidelines from that book:

... use overloading or universal references instead of pass by value unless it's been demonstrated that pass by value yields acceptably efficient code for the parameter type that you need.

... for software that must be as fast as possible, pass by value may not be a viable strategy ...

The last point is alluded to in HolyBlackCat's answer also.

Hari
  • 1,561
  • 4
  • 17
  • 26
  • 1
    For assignment operators, yes, it can be a pessimization to take the argument by value, because when you do so, you force a copy construction of the argument, and copy construction could be much slower than an optimal copy assignment. But when you're implementing a constructor itself, taking the argument by value isn't as likely to be a pessimization. You have to construct the member no matter what. – Brian Bi Jun 24 '23 at 01:02
  • The analogy with assignment operators is to compare having only `C(std::vector s) : _s(std::move(s)) {}`, instead of `C(const std::vector &s) : _s(s) {}` and `C(std::vector&& s) : _s(std::move(s)) {}`. Just like how copy-and-swap based unified assignment operator results in a wasteful move construction of the parameter when passed an rvalue, `C(std::vector s) : _s(std::move(s)) {}` would do the same. Also, that talk is insightful for the present discussion but it conveys its message via special member functions and the copy-and-swap idiom. – Hari Jun 24 '23 at 05:03
  • 1
    but the extra move construction is rarely important. You'll notice Hinnant doesn't actually advise against this pattern for constructors. – Brian Bi Jun 24 '23 at 13:50
  • ok. Point noted. I will investigate and understand this aspect better. – Hari Jun 24 '23 at 17:12
3

For starters, there are no move constructors here. That would be C(C &&).

By default, prefer a lone

C(std::vector<std::string> s) : _s(std::move(s)) {}

This nicely scales to >1 argument.

If you're writing code that needs to be optimal (aka low-level library classes), do this:

C(const std::vector<std::string>& s) : _s(s) {}
C(std::vector<std::string>&& s) : _s(std::move(s)) {}

This saves exactly one move per call compared to the previous option.

This doesn't scale well to N>1 arguments, requiring either 2N overloads or perfect-forwarding templates.


To clarify, scalability is not the reason I recommend one over the other. Verbosity is.

In general, I'm of the opinion that low-level language features are for low-level library code. High-level business logic shouldn't care how to better pass arguments to a constructor, it should use the simplest sane syntax available (option 1). This is what most programmers should do most of the time.

Somewhat high-level library code would use moderately verbose syntax to increase efficiency (option 2).

Hardcore low-level library code would probably use something more convoluted (a pack of forwarding references, with the appropriate requires, noexcept, perhaps explicit(bool), perhaps a second overload for std::initializer_list<T> as the first argument).

HolyBlackCat
  • 78,603
  • 9
  • 131
  • 207