Efficient use of move semantics together with (N)RVO

Question

Let's say I want to implement a function that is supposed to process an object and return a new (possibly changed) object. I would like to do this as efficiently as possible in C++11. The environment is as follows:

class Object {
public:
    /* Implementation of Object */
    Object & makeChanges();
};

The alternatives that come to my mind are:

// First alternative:
Object process1(Object arg) { return arg.makeChanges(); }
// Second alternative:
Object process2(Object const & arg) { return Object(arg).makeChanges(); }
Object process2(Object && arg) { return std::move(arg.makeChanges()); }
// Third alternative:
Object process3(Object const & arg) { 
    Object retObj = arg; retObj.makeChanges(); return retObj; 
}
Object process3(Object && arg) { return std::move(arg.makeChanges()); }

Note: I would like to use a wrapping function like process() because it will do some other work and I would like to have as much code reuse as possible.

Updates:

I used makeChanges() with the given signature because the objects I am dealing with provide methods with that type of signature. I guess they used that for method chaining. I also fixed the two syntax errors mentioned. Thanks for pointing those out. I also added a third alternative, and I will re-pose the question below.

Trying these out with clang [i.e. Object obj2 = process(obj);] results in the following:

The first option makes two calls to the copy constructor: one for passing the argument and one for returning. One could instead say return std::move(..) and have one call to the copy constructor and one call to the move constructor. I understand that RVO cannot get rid of one of these calls because we are dealing with the function parameter.

In the second option, we still have two calls to the copy constructor. Here we make one explicit call and one is made while returning. I was expecting RVO to kick in and get rid of the latter, since the object we are returning is a different object than the argument. However, it did not happen.

In the third option we have only one call to the copy constructor and that is the explicit one. (N)RVO eliminates the copy constructor call we would do for returning.

My questions are the following:

(answered) Why does RVO kick in for the last option and not the second?
Is there a better way to do this?
Had we passed in a temporary, the 2nd and 3rd options would call a move constructor while returning. Is it possible to eliminate that using (N)RVO?

Thanks!

Why would `makeChanges` return an `Object&`? It should either return nothing and be a mutating function or it should be `const` and return a new object by value. Currently, because it is _not_ `const`, your first and second listed options are not even compilable because you're calling a non-const member function on a const object. Effectively this makes your question pretty much nonsensical without a rationale for `makeChanges`' current signature. — ildjarn, Mar 31 '12 at 03:28
@ildjarn: Thanks for the comments. I made changes and reposed the question. I guess I am still not clear about how/when RVO kicks in. I'd love to hear your ideas and suggestions. — iheap, Mar 31 '12 at 06:50

score 18 · Accepted Answer · edited Oct 09 '13 at 15:27

I like to measure, so I set up this Object:

#include <iostream>
#include <utility>  // std::move

struct Object
{
    Object() {}
    Object(const Object&) {std::cout << "Object(const Object&)\n";}
    Object(Object&&) {std::cout << "Object(Object&&)\n";}

    Object& makeChanges() {return *this;}
};

And I theorized that some solutions may give different answers for xvalues and prvalues (both of which are rvalues). And so I decided to test both of them (in addition to lvalues):

Object source() {return Object();}

int main()
{
    std::cout << "process lvalue:\n\n";
    Object x;
    Object t = process(x);
    std::cout << "\nprocess xvalue:\n\n";
    Object u = process(std::move(x));
    std::cout << "\nprocess prvalue:\n\n";
    Object v = process(source());
}
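
As a side note on the three cases tested in main(), the value category of each argument expression can be checked at compile time. The following is a minimal sketch, not part of the original test harness: Widget, makeWidget, CategoryOf, and CATEGORY_OF are hypothetical names introduced only for illustration. It relies on the rule that decltype((expr)) yields T& for an lvalue, T&& for an xvalue, and plain T for a prvalue.

```cpp
#include <utility>  // std::move

struct Widget {};                         // hypothetical stand-in for Object
Widget makeWidget() { return Widget(); }  // hypothetical factory; a call to it is a prvalue

enum class Category { Lvalue, Xvalue, Prvalue };

// decltype((expr)) is T& for an lvalue, T&& for an xvalue, and plain T for a prvalue.
template <class T> struct CategoryOf      { static constexpr Category get() { return Category::Prvalue; } };
template <class T> struct CategoryOf<T&>  { static constexpr Category get() { return Category::Lvalue; } };
template <class T> struct CategoryOf<T&&> { static constexpr Category get() { return Category::Xvalue; } };

// The double parentheses matter: decltype(x) alone would give the declared type of x.
#define CATEGORY_OF(expr) (CategoryOf<decltype((expr))>::get())

inline void checkCategories() {
    Widget x;
    static_assert(CATEGORY_OF(x) == Category::Lvalue, "a named object is an lvalue");
    static_assert(CATEGORY_OF(std::move(x)) == Category::Xvalue, "std::move(x) is an xvalue");
    static_assert(CATEGORY_OF(makeWidget()) == Category::Prvalue, "a by-value call is a prvalue");
    (void)x;  // x is only inspected in unevaluated contexts
}
```

The three static_asserts mirror the lvalue, xvalue, and prvalue calls in main(): a named object, the result of std::move, and the result of a function returning by value.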

Now it is a simple matter of trying all of your possibilities and those contributed by others, plus one I threw in myself:

#if PROCESS == 1

Object
process(Object arg)
{
    return arg.makeChanges();
}

#elif PROCESS == 2

Object
process(const Object& arg)
{
    return Object(arg).makeChanges();
}

Object
process(Object&& arg)
{
    return std::move(arg.makeChanges());
}

#elif PROCESS == 3

Object
process(const Object& arg)
{
    Object retObj = arg;
    retObj.makeChanges();
    return retObj; 
}

Object
process(Object&& arg)
{
    return std::move(arg.makeChanges());
}

#elif PROCESS == 4

Object
process(Object arg)
{
    return std::move(arg.makeChanges());
}

#elif PROCESS == 5

Object
process(Object arg)
{
    arg.makeChanges();
    return arg;
}

#endif

The table below summarizes my results (using clang -std=c++11). The first number is the number of copy constructions and the second number is the number of move constructions:

+----+--------+--------+---------+
|    | lvalue | xvalue | prvalue |    legend: copies/moves
+----+--------+--------+---------+
| p1 |  2/0   |  1/1   |   1/0   |
+----+--------+--------+---------+
| p2 |  2/0   |  0/1   |   0/1   |
+----+--------+--------+---------+
| p3 |  1/0   |  0/1   |   0/1   |
+----+--------+--------+---------+
| p4 |  1/1   |  0/2   |   0/1   |
+----+--------+--------+---------+
| p5 |  1/1   |  0/2   |   0/1   |
+----+--------+--------+---------+

process3 looks like the best solution to me. However, it does require two overloads: one to process lvalues and one to process rvalues. If for some reason this is problematic, solutions 4 and 5 do the job with only one overload at the cost of 1 extra move construction for glvalues (lvalues and xvalues). It is a judgement call as to whether one wants to pay an extra move construction to save overloading (and there is no one right answer).

(answered) Why does RVO kick in for the last option and not the second?

For RVO to kick in, the return statement needs to name a local object (not a function parameter or a reference) directly, like:

return arg;

If you complicate that with:

return std::move(arg);

or:

return arg.makeChanges();

then RVO gets inhibited.
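
The three return forms can be compared with a small counting sketch. This is only an illustration under stated assumptions: Counted, Tally, measure, and the three by... functions are hypothetical names, and whether the by-name version is actually elided remains up to the compiler.

```cpp
#include <utility>  // std::move

// Hypothetical instrumented type; the static counters are an assumption for illustration.
struct Counted {
    static int copies;
    static int moves;
    Counted() {}
    Counted(const Counted&) { ++copies; }
    Counted(Counted&&) { ++moves; }
    Counted& makeChanges() { return *this; }
};
int Counted::copies = 0;
int Counted::moves = 0;

// Names a local object directly: eligible for NRVO, otherwise moved implicitly.
Counted byName() {
    Counted local;
    local.makeChanges();
    return local;
}

// The operand is a call expression yielding Counted&&, so nothing can be
// elided here and the move constructor has to run.
Counted byStdMove() {
    Counted local;
    return std::move(local.makeChanges());
}

// The operand is an lvalue of type Counted&, so the result is copy constructed.
Counted byReference() {
    Counted local;
    return local.makeChanges();
}

struct Tally { int copies; int moves; };

// Resets the counters, calls f once, and reports what the call cost.
template <class F>
Tally measure(F f) {
    Counted::copies = 0;
    Counted::moves = 0;
    Counted result = f();
    (void)result;
    return Tally{Counted::copies, Counted::moves};
}
```

Calling measure(byName), measure(byStdMove), and measure(byReference) gives the tallies. On a compiler that performs NRVO, byName should report no copies and no moves, while byStdMove always pays at least one move and byReference always pays one copy.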

Is there a better way to do this?

My favorites are p3 and p5. My preference of p5 over p4 is merely stylistic. I shy away from putting move on the return statement when I know it will be applied automatically, for fear of accidentally inhibiting RVO. However, in p5 RVO is not an option anyway, even though the return statement does get an implicit move. So p5 and p4 really are equivalent. Pick your style.

Had we passed in a temporary, the 2nd and 3rd options would call a move constructor while returning. Is it possible to eliminate that using (N)RVO?

The "prvalue" column vs "xvalue" column addresses this question. Some solutions add an extra move construction for xvalues and some don't.
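
To make the xvalue versus prvalue difference concrete for a by-value parameter (the shape of p5), here is a hedged sketch. Tracked, Counts, and the count... helpers are hypothetical names that do not appear in the test program above; the exact move counts depend on which elisions the compiler performs.

```cpp
#include <utility>  // std::move

// Hypothetical instrumented type; names and counters are assumptions for illustration.
struct Tracked {
    static int copies;
    static int moves;
    Tracked() {}
    Tracked(const Tracked&) { ++copies; }
    Tracked(Tracked&&) { ++moves; }
    Tracked& makeChanges() { return *this; }
};
int Tracked::copies = 0;
int Tracked::moves = 0;

Tracked source() { return Tracked(); }  // a call to source() is a prvalue

// Same shape as p5: by-value parameter, returned by name.
Tracked process(Tracked arg) {
    arg.makeChanges();
    return arg;  // a parameter cannot be elided, so it is moved out
}

struct Counts { int copies; int moves; };

inline void resetCounts() { Tracked::copies = 0; Tracked::moves = 0; }
inline Counts currentCounts() { return Counts{Tracked::copies, Tracked::moves}; }

Counts countLvalue() {
    Tracked x;
    resetCounts();
    Tracked r = process(x);             // copy into arg, move out
    (void)r;
    return currentCounts();
}

Counts countXvalue() {
    Tracked x;
    resetCounts();
    Tracked r = process(std::move(x));  // move into arg, move out
    (void)r;
    return currentCounts();
}

Counts countPrvalue() {
    resetCounts();
    Tracked r = process(source());      // arg can be built in place, then moved out
    (void)r;
    return currentCounts();
}
```

An xvalue refers to an object that already exists, so it has to be moved into the parameter. A prvalue can be constructed directly in the parameter, which is why the xvalue column shows one more move than the prvalue column for p4 and p5.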

score 2 · Answer 2 · answered Mar 31 '12 at 04:06

None of the functions you show will have any significant return value optimizations on their return values.

makeChanges returns an Object&. Therefore, it must be copied into a value, since you're returning it. So the first two will always make a copy of the value to be returned. In terms of the number of copies, the first one makes two copies (one for the parameter, one for the return value). The second one also makes two copies (one explicitly in the function, one for the return value).

The third one shouldn't even compile, since you can't implicitly convert an l-value reference into an r-value reference.

So really, don't do this. If you want to pass an object, and modify it in-situ, then just do this:

Object &process1(Object &arg) { return arg.makeChanges(); }

This modifies the provided object. No copying or anything. Granted, one might wonder why process1 isn't a member function or something, but that doesn't matter.

Thanks for the answer. I see why the second alternative still needs a copy now. However I do not understand how your proposed solution fits in. The function I'd like needs to generate a new `Object`. — iheap, Mar 31 '12 at 07:02

score 0 · Answer 3 · answered Mar 31 '12 at 07:41

The fastest way to do this is: if the argument is an lvalue, copy it and return that copy; if it is an rvalue, move it. The return can always be moved or have RVO/NRVO applied. This is easily accomplished:

Object process1(Object arg) {
    return std::move(arg.makeChanges());
}

This is very similar to the canonical C++11 forms of many kinds of operator overloads.
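
For comparison, one commonly cited shape for a binary operator takes the left operand by value and builds on the compound assignment. The sketch below is an assumption about what is meant by the canonical form: Text is a hypothetical type, and the parameter is returned by name rather than through std::move.

```cpp
#include <string>

// Hypothetical value type used only for illustration.
struct Text {
    std::string value;
    Text& operator+=(const Text& rhs) {
        value += rhs.value;
        return *this;
    }
};

// lhs is taken by value: copied from lvalue arguments, moved from rvalue arguments.
// Returning the parameter by name lets the compiler move it out implicitly.
// Writing `return lhs += rhs;` instead would return through a Text& and copy.
Text operator+(Text lhs, const Text& rhs) {
    lhs += rhs;
    return lhs;
}
```

With Text a{"foo"} and Text b{"bar"}, the expression a + b copies a into lhs and leaves a untouched, while Text{"x"} + b starts from a temporary and needs no copy of the left operand.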

Puppy


When I tried that, I saw that the move constructor is always called. RVO does not kick in. In light of the first answer, I thought this was due to the fact that `std::move` gives us an (r-value) reference, yet we return by value. Are you saying RVO should (or could?) have kicked in? If not, `process3` seems to be slightly more efficient (no call to the move constructor). – iheap Mar 31 '12 at 08:18
@iheap: RVO/NRVO absolutely can kick in for this case. Of course, the exact situations in which it can be applied are, at best, implementation-dependent. All you can do is permit it; it's up to the compiler to actually do it. – Puppy Mar 31 '12 at 11:11
@DeadMG: There's no RVO for this, because the return expression is an r-value reference. Since the return type is `Object`, the return value must be constructed from the r-value reference. RVO isn't the same thing as moving a temporary; it means completely eliding the copy/move out of the function. And you can't do that with the code you've given here. – Nicol Bolas Mar 31 '12 at 16:50
