3

In a code review, my co-worker and I were discussing the interface for a function I was writing. Our code base uses C++17 and we do not use exceptions (video games).

I claimed that the idiomatic C++ way of taking sink parameters would be performant while also keeping the interface flexible, allowing the caller to pass a copy or move from an owned value as desired. By idiomatic way, I mean either one function taking the parameter by value, or an overoad set for const lvalue and rvalue references (which requires one less move in the lvalue case, at the cost of some code duplication).

struct A {};

class ByValue
{
public:
    ByValue(std::vector<A> v)
        : m_v(std::move(v))
    {}

private:
    std::vector<A> m_v;
};

class RefOverloads
{
public:
    RefOverloads(std::vector<A> const& v)
        : m_v(v)
    {}

    RefOverloads(std::vector<A>&& v)
        : m_v(std::move(v))
    {}

private:
    std::vector<A> m_v;
};

int main()
{
    std::vector<A> v0;
    ByValue value0(v0);
    ByValue value1(std::move(v0));

    std::vector<A> v1;
    RefOverloads ref0(v1);
    RefOverloads ref1(std::move(v1));
}

My coworker on the other hand does not like that it is easy to implicitly make expensive copies. He would prefer that these sink arguments are always by rvalue reference (with no const lvalue ref overload), and that if the caller wishes to pass a copy they must make a local copy and move it into the function.

class RvalueRefOnly
{
public:
    RvalueRefOnly(std::vector<A>&& v)
        : m_v(std::move(v))
    {}

private:
    std::vector<A> m_v;
};

int main()
{
    std::vector<A> v;
    //RvalueRefOnly failedCopy(v); // Fails purposefully.

    std::vector<A> vCopy = v;               // Explicit copy of v.
    RvalueRefOnly okCopy(std::move(vCopy)); // Move into okCopy.
}

I've never even thought of an interface like this. A counter-argument I had was that taking by value expresses intent better, i.e, with signature

void f(T x);

the caller knows that f has taken ownership of x. With

void g(T&& x);

g may have ownership or may not, depending on the implementation of f.

Is there a best way? Am I missing some argument one way or the other?

  • Pass by value will, in most cases, be perfectly optimized by your compiler and value semantics are easier to reason about for the human behind the screen writing code. I'd go with pass by value when possible, unless your profiler shows you a real reason not to. Don't make it complicated unless you have a *proven* need to do so. – Jesper Juhl May 04 '20 at 17:17
  • What is your end goal, to get the most performance? – NathanOliver May 04 '20 at 17:17
  • This is entirely an API design question and is probably just a matter of opinion. Pass by value will probably be just as fast when used with move semantics. The question is whether you want to impose on the user the requirement that they make their own copy and move it when they intend to, to avoid accidental copies when it wasn't intended. You could argue it's safer where a copy would be damaging, and you can argue that it is inconvenient if that isn't a common case. – François Andrieux May 04 '20 at 17:18
  • 1
    @JesperJuhl *Pass by value will, in most cases, be perfectly optimized by your compiler* Really? I know return by value is, but pass by value AFAIK leaves a lot of performance on the table in most cases. – NathanOliver May 04 '20 at 17:18
  • @NathanOliver I've seen lots of cases where the compiler will optimize pass-by-value into moves and even when it doesn't, the cases where it *actually* matters more than "programmer comprehension of the code" have been few and far between. – Jesper Juhl May 04 '20 at 17:21
  • 1
    Another way of putting it : Your coworker's argument is *"My coworker does not like that it is easy to implicitly make expensive copies."* Your coworker is right *if* that concern is a problem. That depends on the specific unique circumstances of your project or product. In my opinion, there isn't an objective answer to this question. – François Andrieux May 04 '20 at 17:25
  • Leaving object creation for the caller will lead to constructor code being invoked (and probably inlined) at every call location bloating the executable. So i'd say that the rule of thumb is to implement overloads for `const &` and `&&` when constructor is rather large / expensive or to accept by value if it is some primitive / lightweight object. `std::vector` clearly falls into first category. – user7860670 May 04 '20 at 17:26
  • You can have your cake and eat it too with perfect forwarding: http://coliru.stacked-crooked.com/a/b9cb267b1b3c219c – NathanOliver May 04 '20 at 17:32
  • To clarify: The question is not by value vs. (const& and && overloads). It is (value or (const& and && overloads)) vs. (only &&, no const&). – Michael Deom May 04 '20 at 17:51
  • Related to [pass-by-value-vs-pass-by-rvalue-reference](https://stackoverflow.com/questions/37935393/pass-by-value-vs-pass-by-rvalue-reference) – Jarod42 May 04 '20 at 18:50

1 Answers1

2

You basically have those constructor options:

class myclass {
public:
    // #1 myclass(const std::string& s) : s(s) {}
    // #2 myclass(std::string&& s) : s(std::move(s)) {}

    // #3 myclass(std::string s) : s(std::move(s)) {}

    // #4 template <typename T> myclass(T&& t) : s(std::forward<T>(t)) {}

    std::string s;
};

#3 cannot be present with #1 or #2 -> ambiguous call

and call

std::string s;
myclass A(s);
myclass B(std::string(s));
myclass C(std::move(s));
myclass D("temporary");
myclass E({5, '*'});

Following is count of copy/move constructor.

                        | A | B | C | D | E |
------------------------|---|---|---|---|---|
1                  Copy |<1>| 2 | 1 | 1 | 1 | <.> Denotes best result (by column)
const l-value ref  Move |<0>| 1 | 1 | 0 | 0 |
                  Other |<0>| 0 | 0 | 1 | 1 |
------------------------|---|-v-|-v-|-v-|-v-| B/C/D/E would prefer overload 2
2                  Copy | X |<1>|<0>| 0 |<0>|
r-value ref        Move | X |<1>|<1>| 1 |<1>| X denotes invalid case
                  Other | X |<0>|<0>| 1 |<1>|
------------------------|---|---|---|---|---|
3                  Copy | 1 | 1 | 0 | 0 |<0>|
by value           Move | 1 | 2 | 2 | 1 |<1>|
                  Other | 0 | 0 | 0 | 1 |<1>|
------------------------|---|---|---|---|---|
4                  Copy |<1>|<1>|<0>|<0>| X |
Forwarding ref     Move |<0>|<1>|<1>|<0>| X |
                  Other |<0>|<0>|<0>|<1>| X |
--------------------------------------------/

Possible configurations:

  • #1 only: handle all cases, but does copy for temporary
  • #1/#2: (B/C/D/E would use #2), so best results except for in-place construct
  • #3 only: Handle all cases, but does extra move
  • #4 only: Handle most regular cases, best results
  • #1/#2/#4: best results (Notice that #4) has exact match over non-const l-value)
  • #2/#4: Best results
  • #2 only: Forbid copy, but explicit copy (B) does 1 extra move than #1/#2

As you can see:

  • Forwarding reference (#4) has best results.
  • By const ref (#1) has best performance for copy, but worse performance for other.
  • Then By Value (#3) is the second "worst", but only one extra move from the best.

Other points:

  • only #1 alone was available pre-C++11 (so was the default in most interface)
  • only #1 might mean no ownership transfers.
  • Move only (#2) forbids implicit copy
  • Purpose of By Value #3 is to write only one overload as good compromise.

Now compare #1/#2, #2 and #3:

  • For move only type:

  • #1/#2 is irrelevant

  • #3 sinks immediately.

  • #2 has the opportunity to handle exception guarantee: if it throws, object is not necessary consumed.

Unless you want guaranteed and/or immediate sink, I would use pass-by-rvalue (#2).

For existing code base, I would keep consistency.

For copyable types:

  • #1/#2 is the most efficient, but allows unwanted copy.
  • #3 is convenient (and only one extra move), but allows unwanted copy, guaranteed sink.
  • #2 avoid unwanted copy.

Now it is mostly what you want to guaranty and allow:

  • best performance -> #1/#2
  • no (implicit) copies -> #2
  • immediate/guaranteed sink -> #3
  • consistency with movable only types

For existing code base, I would keep consistency.

Jarod42
  • 203,559
  • 14
  • 181
  • 302