16

Consider the following:

struct vec
{
    int v[3];

    vec() : v() {};
    vec(int x, int y, int z) : v{x,y,z} {};
    vec(const vec& that) = default;
    vec& operator=(const vec& that) = default;
    ~vec() = default;

    vec& operator+=(const vec& that)
    {
        v[0] += that.v[0];
        v[1] += that.v[1];
        v[2] += that.v[2];
        return *this;
    }
};

vec operator+(const vec& lhs, const vec& rhs)
{
    return vec(lhs.v[0] + rhs.v[0], lhs.v[1] + rhs.v[1], lhs.v[2] + rhs.v[2]);
}
vec&& operator+(vec&& lhs, const vec& rhs)
{
    return move(lhs += rhs);
}
vec&& operator+(const vec& lhs, vec&& rhs)
{
    return move(rhs += lhs);
}
vec&& operator+(vec&& lhs, vec&& rhs)
{
    return move(lhs += rhs);
}

Thanks to r-value references, with these four overloads of operator+ I can minimize the number of objects created, by reusing temporaries. But I don't like the duplication of code this introduces. Can I achieve the same with less repetition?

R. Martinho Fernandes
  • 228,013
  • 71
  • 433
  • 510

4 Answers4

36

Recycling temporaries is an interesting idea and you're not the only one who wrote functions that return rvalue references for this reason. In an older C++0x draft operator+(string&&,string const&) was also declared to return an rvalue reference. But this changed for good reasons. I see three issues with this kind of overloading and choice of return types. Two of them are independent of the actual type and the third argument refers to the kind of type that vec is.

  1. Safety issues. Consider code like this:

    vec a = ....;
    vec b = ....;
    vec c = ....;
    auto&& x = a+b+c;
    

    If your last operator returns an rvalue reference, x will be a dangling reference. Otherwise, it won't. This is not an artificial example. For example, the auto&& trick is used in the for-range loop internally to avoid unnecessary copies. But since the life-time extension rule for temporaries during reference binding does not apply in case of a function call that simply returns a reference, you'll get a dangling reference.

    string source1();
    string source2();
    string source3();
    
    ....
    
    int main() {
      for ( char x : source1()+source2()+source3() ) {}
    }
    

    If the last operator+ returned an rvalue reference to the temporary that is created during the first concatenation, this code would invoke undefined behaviour because the string temporary would not exist long enough.

  2. In generic code, functions that return rvalue references force you to write

    typename std::decay<decltype(a+b+c)>::type
    

    instead of

    decltype(a+b+c)
    

    simply because the last op+ might return an rvalue reference. This is getting ugly, in my humble opinion.

  3. Since your type vec is both "flat" and small, these op+ overloads are hardly useful. See FredOverflow's answer.

Conclusion: Functions with an rvalue reference return type should be avoided especially if these references may refer to short-lived temporary objects. std::move and std::forward are special-purpose exceptions to this rule of thumb.

sellibitze
  • 27,611
  • 3
  • 75
  • 95
  • 5
    Upvoted. Nice summary of the dangers here, and the history. I'm the one guilty of introducing this bug into the early drafts. You can find it first proposed here: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2002/n1377.htm#string%20Motivation And we fixed it with an LWG issue here (which I voted in favor of): http://www.open-std.org/jtc1/sc22/wg21/docs/lwg-defects.html#1138 . – Howard Hinnant May 17 '11 at 14:22
  • I'm not sure I'm satisfied with objection (2) regarding. We don't really need to use `decay` on the `decltype`. Or, more precisely, we only need to use it *if we've already decided we don't like `&&`-refs as return values*. That seems like a circular argument to me. – Aaron McDaid Jul 11 '15 at 20:11
  • ... (continued) ... for me, it goes back to the first objection regarding safety. The `auto &&` within `for(:)` is confusing. You should make clear you are referring to the right hand side of the `:`, not the left hand side. – Aaron McDaid Jul 11 '15 at 20:14
  • 1
    ... (continued) ... On the right hand side of the `:` within a `for(:)`, would you expect this to work?: `for (auto x : make_a_vector_of_vectors().front()) { .. }` My point with this example is that `&`-refs are not better than `&&`-refs. Developers should consider carefully the lifetime on the right hand side of the `:` when there is a function all in it. – Aaron McDaid Jul 11 '15 at 20:17
  • (Sorry for all these comments. I think I've made distinct points in each. And I must say that I'm being Devil's Advocate to some extent - I might use `&&` as return much less often in future as a result of reading this answer. Thanks!) – Aaron McDaid Jul 11 '15 at 20:18
  • @HowardHinnant It seems like those bugs are still present in std::get(std::tuple<...>) flavors. That is, when passed an rvalue reference tuple, they return rvalue references. Is this different for some reason? Is there any plan to correct this? Edit: I recently asked a related question, http://stackoverflow.com/questions/31359829/why-does-get-helper-of-stdtuple-return-rvalue-reference-instead-of-value/31360610?noredirect=1#comment50703282_31360610, and was led here by Aaron McDaid. – Nir Friedman Jul 11 '15 at 22:53
9

Since your vec type is "flat" (there is no external data), moving and copying do exactly the same thing. So all your rvalue references and std::moves gain you absoutely nothing in performance.

I would get rid of all additional overloads and just write the classic reference-to-const version:

vec operator+(const vec& lhs, const vec& rhs)
{
    return vec(lhs.v[0] + rhs.v[0], lhs.v[1] + rhs.v[1], lhs.v[2] + rhs.v[2]);
}

In case you have little understanding of move semantics yet, I recommend studying this question.

Thanks to r-value references, with these four overloads of operator+ I can minimize the number of objects created, by reusing temporaries.

With a few exceptions, returning rvalue references is a very bad idea, because calls of such functions are xvalues instead of prvalues, and you can get nasty temporary object lifetime problems. Don't do it.

Community
  • 1
  • 1
fredoverflow
  • 256,549
  • 94
  • 388
  • 662
  • [(chat bookmark)](http://chat.stackoverflow.com/transcript/message/719418#719418) – Fred Nurk May 15 '11 at 12:06
  • @FredNurk: Any C++0x expert will tell you that returning rvalue references in your own code is wrong. Ask Stephan T. Lavavej or Douglas Gregor, for example. – fredoverflow May 15 '11 at 12:13
  • Do you have a blog post, mailing list archive, or something else where they say this? – Fred Nurk May 15 '11 at 12:26
  • @FredNurk: You can read Dave Abrahams' and Howard Hinnant's opinion on the issue [here](http://objectmix.com/c/751862-c-0x-returning-rvalue-references-recycling-temporaries.html). – fredoverflow May 15 '11 at 12:40
  • +1. Recycling of temporaries is a nice idea but it comes with its own problems (and is hardly useful in *this* case) which is why I'm totally agreeing with you. Greetings from Gö to HH. :) – sellibitze May 17 '11 at 11:44
  • @fredoverflow That link (in the comment) is dead. – user202729 Dec 29 '19 at 10:58
7

This, which already works wonderfully in current C++, will use move semantics (if available) in C++0x. It already handles all cases, but relies on copy elision and inlining to avoid copies – so it may make more copies than desired, particularly for the second parameter. The nice bit about this is it works without any other overloads and does the right thing (semantically):

vec operator+(vec a, vec const &b) {
  a += b;
  return a;  // "a" is local, so this is implicitly "return std::move(a)",
             // if move semantics are available for the type.
}

And this is where you would stop, 99% of the time. (I am likely underestimating that figure.) The rest of this answer only applies once you know, such as through the use of a profiler, that extra copies from op+ are worth further optimization.


To completely avoid all possible copies/moves, you would indeed need these overloads:

// lvalue + lvalue
vec operator+(vec const &a, vec const &b) {
  vec x (a);
  x += b;
  return x;
}

// rvalue + lvalue
vec&& operator+(vec &&a, vec const &b) {
  a += b;
  return std::move(a);
}

// lvalue + rvalue
vec&& operator+(vec const &a, vec &&b) {
  b += a;
  return std::move(b);
}

// rvalue + rvalue, needed to disambiguate above two
vec&& operator+(vec &&a, vec &&b) {
  a += b;
  return std::move(a);
}

You were on the right track with yours, with no real reduction possible (AFAICT), though if you need this op+ often for many types, a macro or CRTP could generate it for you. The only real difference (my preference for separate statements above is minor) is yours make copies when you add two lvalues in operator+(const vec& lhs, vec&& rhs):

return std::move(rhs + lhs);

Reducing duplication through CRTP

template<class T>
struct Addable {
  friend T operator+(T const &a, T const &b) {
    T x (a);
    x += b;
    return x;
  }

  friend T&& operator+(T &&a, T const &b) {
    a += b;
    return std::move(a);
  }

  friend T&& operator+(T const &a, T &&b) {
    b += a;
    return std::move(b);
  }

  friend T&& operator+(T &&a, T &&b) {
    a += b;
    return std::move(a);
  }
};

struct vec : Addable<vec> {
  //...
  vec& operator+=(vec const &x);
};

Now there's no longer a need to define any op+ specifically for vec. Addable is reusable for any type with op+=.

Fred Nurk
  • 13,952
  • 4
  • 37
  • 63
  • This does avoid copies if I write `vec() + vec() + vec()` (all r-values). But it doesn't if I write `vec o; o + o + o;` (all l-values). At least GCC 4.5 with -O3 can't optimize them away :( – R. Martinho Fernandes May 15 '11 at 05:18
  • @MartinhoFernandes: o + o + o requires at least one copy. How many do you see? – Fred Nurk May 15 '11 at 05:24
  • @Fred: Yes, I was expecting one copy. But o+o+o does three. – R. Martinho Fernandes May 15 '11 at 05:27
  • Thanks for noticing that bug! It was a typo. – R. Martinho Fernandes May 15 '11 at 05:51
  • [(chat bookmark)](http://chat.stackoverflow.com/transcript/message/718944#718944) – Fred Nurk May 15 '11 at 06:02
  • 3
    Downvoted. This answer is misleading and recommends dangerous code. – Howard Hinnant May 15 '11 at 13:31
  • 1
    @HowardHinnant: How is this answer misleading or the code dangerous? For example, see [this response](http://chat.stackoverflow.com/transcript/message/719667#719667) to an older discussion you participated in. Who is creating local references (not parameters) without knowing why they do it? Why are we concerned with protecting possibly/probably idiotic code? This strikes me as very similar to the argument for op= returning a const& (to "prevent" nonsense like (a = b) = c), which is widely frowned upon. – Fred Nurk May 15 '11 at 13:43
  • 2
    For the reasons FredOverflow says. This example should be written exactly as you would in C++03, though the defaulted members are a nice clarification. There are no resources to move here. Move semantics does not reduce the number of temporaries. It moves resources from one temporary to another. If you want to reduce the number of temporaries, go with expression templates: http://en.wikipedia.org/wiki/Expression_templates . – Howard Hinnant May 15 '11 at 13:45
  • @HowardHinnant: Even though I'm using std::move, it's not move semantics -- yet, when the result (the rvalue-ref) is consumed by the user of op+, then move semantics apply. – Fred Nurk May 15 '11 at 13:48
  • 1
    @HowardHinnant: Fred Overflow *doesn't give* any reasons. His answer just says "gains you absolutely nothing in performance" (but that's wrong for expensive to copy/move types) and "don't do it." – Fred Nurk May 15 '11 at 13:49
  • 2
    Change `vec` so that it owns resources and then we can discuss move semantics. What you and the OP have coded is just unnecessarily obfuscated code, and if "deemed" correct will just add huge amounts of confusion to rvalue references. – Howard Hinnant May 15 '11 at 13:55
  • 1
    @HowardHinnant: That sounds like you are saying this shouldn't be used because rvalue-refs are too hard to understand. Yet you recommend expression templates which the average programmer definitely doesn't understand, much less will be able to implement. How is this "unnecessarily obfuscated"? Compared to the only other solution offered, expression templates, it is *clean, clear, and concise.* – Fred Nurk May 15 '11 at 14:07
  • In the OP's example (and your answer) the extra `operator+` overloads add no value. It is like writing `int* ip = &*&*&i;` when `int* ip = &i` gets the job done. – Howard Hinnant May 15 '11 at 14:17
  • 2
    @HowardHinnant: They do exactly what was asked: minimize the number of objects created. If you have a type with an expensive copy & move, you've got an extra copy gratuitously added. I'm thinking array for example. You don't want to accidentally introduce an extra copy. On the risk side, "const X& x = a + b + c;" is pretty rare. I don't think I've ever seen it. (This should [sound familiar](http://stackoverflow.com/questions/5770253/returning-a-rvalue-reference-in-c0x/5770888#comment-6942937).) – Fred Nurk May 15 '11 at 14:55
  • @Fred: Honestly, this is very funny!!! :-) Very nice debating trick. Due to lack of room to form a response, I'm putting it in an answer... – Howard Hinnant May 15 '11 at 15:31
  • 1
    @Fred: Your recent fix to your CRTP solution introduces one more copy than my fix to your CRTP solution. :-) This is because my solution allowed the compiler to RVO the answer whereas yours doesn't. – Howard Hinnant May 15 '11 at 15:57
  • @HowardHinnant: I knew better than to be clever, but I tried to piggyback on another op+ anyway. :( – Fred Nurk May 15 '11 at 16:01
  • A small question about this: The 3rd version (lvalue + rvalue) of your 'to remove all possible copies/moves' code snippet swaps the operands of `+=`. This assumes that `+` has been implemented as a commutative operation, right? (Two examples of non-commutative operations would be string concatenation and other operators like `-=` or `/=`.) How should the code be changed for non-commutative operators? – Qqwy Feb 04 '17 at 19:16
1

I coded up Fred Nurk's answer using clang + libc++. I had to remove the use of initializer syntax because clang doesn't yet implement that. I also put a print statement in the copy constructor so that we could count copies.

#include <iostream>

template<class T>
struct AddPlus {
  friend T operator+(T a, T const &b) {
    a += b;
    return a;
  }

  friend T&& operator+(T &&a, T const &b) {
    a += b;
    return std::move(a);
  }

  friend T&& operator+(T const &a, T &&b) {
    b += a;
    return std::move(b);
  }

  friend T&& operator+(T &&a, T &&b) {
    a += b;
    return std::move(a);
  }

};

struct vec
    : public AddPlus<vec>
{
    int v[3];

    vec() : v() {};
    vec(int x, int y, int z)
    {
        v[0] = x;
        v[1] = y;
        v[2] = z;
    };
    vec(const vec& that)
    {
        std::cout << "Copying\n";
        v[0] = that.v[0];
        v[1] = that.v[1];
        v[2] = that.v[2];
    }
    vec& operator=(const vec& that) = default;
    ~vec() = default;

    vec& operator+=(const vec& that)
    {
        v[0] += that.v[0];
        v[1] += that.v[1];
        v[2] += that.v[2];
        return *this;
    }
};

int main()
{
    vec v1(1, 2, 3), v2(1, 2, 3), v3(1, 2, 3), v4(1, 2, 3);
    vec v5 = v1 + v2 + v3 + v4;
}

test.cpp:66:22: error: use of overloaded operator '+' is ambiguous (with operand types 'vec' and 'vec')
    vec v5 = v1 + v2 + v3 + v4;
             ~~~~~~~ ^ ~~
test.cpp:5:12: note: candidate function
  friend T operator+(T a, T const &b) {
           ^
test.cpp:10:14: note: candidate function
  friend T&& operator+(T &&a, T const &b) {
             ^
1 error generated.

I fixed this error like so:

template<class T>
struct AddPlus {
  friend T operator+(const T& a, T const &b) {
    T x(a);
    x += b;
    return x;
  }

  friend T&& operator+(T &&a, T const &b) {
    a += b;
    return std::move(a);
  }

  friend T&& operator+(T const &a, T &&b) {
    b += a;
    return std::move(b);
  }

  friend T&& operator+(T &&a, T &&b) {
    a += b;
    return std::move(a);
  }

};

Running the example outputs:

Copying
Copying

Next I tried a C++03 approach:

#include <iostream>

struct vec
{
    int v[3];

    vec() : v() {};
    vec(int x, int y, int z)
    {
        v[0] = x;
        v[1] = y;
        v[2] = z;
    };
    vec(const vec& that)
    {
        std::cout << "Copying\n";
        v[0] = that.v[0];
        v[1] = that.v[1];
        v[2] = that.v[2];
    }
    vec& operator=(const vec& that) = default;
    ~vec() = default;

    vec& operator+=(const vec& that)
    {
        v[0] += that.v[0];
        v[1] += that.v[1];
        v[2] += that.v[2];
        return *this;
    }
};

vec operator+(const vec& lhs, const vec& rhs)
{
    return vec(lhs.v[0] + rhs.v[0], lhs.v[1] + rhs.v[1], lhs.v[2] + rhs.v[2]);
}

int main()
{
    vec v1(1, 2, 3), v2(1, 2, 3), v3(1, 2, 3), v4(1, 2, 3);
    vec v5 = v1 + v2 + v3 + v4;
}

Running this program produced no output at all.

These are the results I got with clang++. Interpret them how you may. And your milage may vary.

Howard Hinnant
  • 206,506
  • 52
  • 449
  • 577
  • 3
    You haven't reduced the number of objects created, merely hid them. Instead of copying through a copy constructor, you copy each data piece manually and create through a different constructor which has no output. Change to vec operator+(vec lhs, const vec& rhs) { lhs += rhs; return lsh; } and what do you get? – Fred Nurk May 15 '11 at 15:36
  • As stated in my answer, when I change to `vec operator+(vec lhs, const vec& rhs) { lhs += rhs; return lsh; }`, I get a compile-time error. What do you get? – Howard Hinnant May 15 '11 at 15:48
  • @HowardHinnant: I mean for your second piece of code, what you call the "03 approach". Use vec operator+(const vec& lhs, const vec& rhs) { vec x (lhs); x += rhs; return x; } there too, if you prefer. – Fred Nurk May 15 '11 at 15:52
  • I see, thanks for the clarification. Yes, those two options look equivalent to me. – Howard Hinnant May 15 '11 at 16:05
  • @Fred I think the whole point is that adding all those overloads will (in fact, I think they will) reduce temporaries, but there is no point in doing so when the type is flat. Introducing a temporary and initializing it with the `3` integer computed values doesn't seem to me to be more costly than taking an existing temporary (by one of those rvalue ref overloads) and modifying the 3 values of that temporary. In both cases you need to go over all 3 values. Now I haven't done elaborated tests on the matter, so I only say this out of my intuition. What you save is stack space. – Johannes Schaub - litb May 15 '11 at 16:09
  • @JohannesSchaub-litb: Whose point do you think that is? "All those overloads" is just three no matter how many data members there are, and I did clarify my answer that this optimization won't matter 99+% of the time; however, when it does matter, that += will be optimized (e.g. instead of an array of 3, it's 300, and using SSE) while the additional stack may start to exceed caches, etc. – Fred Nurk May 15 '11 at 16:16