
Return value optimization (RVO) is an optimization technique involving copy elision, which eliminates the temporary object created to hold a function's return value in certain situations. I understand the benefit of RVO in general, but I have a couple of questions.

The standard says the following about it in §12.8, paragraph 32 of this working draft (emphasis mine).

When certain criteria are met, an implementation is allowed to omit the copy/move construction of a class object, even if the copy/move constructor and/or destructor for the object have side effects. In such cases, the implementation treats the source and target of the omitted copy/move operation as simply two different ways of referring to the same object, and the destruction of that object occurs at the later of the times when the two objects would have been destroyed without the optimization.

It then lists a number of criteria when the implementation may perform this optimization.


I have a couple of questions regarding this potential optimization:

  1. I am used to optimizations being constrained such that they cannot change observable behaviour. This restriction does not seem to apply to RVO. Do I ever need to worry about the side effects mentioned in the standard? Do corner cases exist where this might cause trouble?

  2. What do I as a programmer need to do (or not do) to allow this optimization to be performed? For example, does the following prohibit the use of copy elision (due to the move):

std::vector<double> foo(int bar){
    std::vector<double> quux(bar,0);
    return std::move(quux);
}

Edit

I posted this as a new question because the specific questions I mentioned are not directly answered in other, related questions.

Marc Claesen
  • possible duplicate of [What are copy elision and return value optimization?](http://stackoverflow.com/questions/12953127/what-are-copy-elision-and-return-value-optimization) – RedX Nov 05 '13 at 15:11
  • Hmm, maybe not a duplicate. You'd better rephrase the question. – RedX Nov 05 '13 at 15:12
  • @RedX I will change the title. It's not really a duplicate but I see why you suspected it to be. – Marc Claesen Nov 05 '13 at 15:14
  • There are two possible optimizations for `auto x = foo(42);` 1) Copy/move from `quux` to the return value temporary. 2) Copy/move from the return value temporary to `x`. The first here is NRVO, and happens only if the expression in the return-statement is a *name* (i.e. `move(quux)` prohibits that optimization). The second one can still be applied. – dyp Nov 05 '13 at 15:27

4 Answers


I am used to optimizations being constrained such that they cannot change observable behaviour.

This is correct. As a general rule -- known as the as-if rule -- compilers can change code if the change is not observable.

This restriction does not seem to apply to RVO.

Yes. The clause quoted in the OP gives an exception to the as-if rule and allows copy construction to be omitted, even when it has side effects. Notice that the RVO is just one case of copy-elision (the first bullet point in C++11 12.8/31).

Do I ever need to worry about the side effects mentioned in the standard?

If the copy constructor has side effects such that copy elision when performed causes a problem, then you should reconsider the design. If this is not your code, you should probably consider a better alternative.

What do I as a programmer need to do (or not do) to allow this optimization to be performed?

Basically, if possible, return a local variable (or temporary) with the same cv-unqualified type as the function return type. This allows RVO but doesn't enforce it (the compiler might not perform RVO).

For example, does the following prohibit the use of copy elision (due to the move):

// notice that I fixed the OP's example by adding <double>
std::vector<double> foo(int bar){
    std::vector<double> quux(bar, 0);
    return std::move(quux);
}

Yes, it does because you're not returning the name of a local variable. This

std::vector<double> foo(int bar){
    std::vector<double> quux(bar,0);
    return quux;
}

allows RVO. One might be worried that if RVO is not performed then moving is better than copying (which would explain the use of std::move above). Don't worry about that. All major compilers will do the RVO here (at least in a release build). Even if a compiler doesn't do RVO but the conditions for RVO are met, it will try to do a move rather than a copy. In summary, using std::move above will certainly make a move. Not using it will likely neither copy nor move anything and, in the worst (unlikely) case, will move.
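A minimal sketch of how one might observe this difference. The Tracer type, the Counts struct and the measure() helper are hypothetical names invented for illustration (they are not from the original post); the constructors only bump counters. No expected output is shown, because whether NRVO is actually performed is up to the compiler.

```cpp
#include <utility>

// Hypothetical tracer type: counts how often it is copied or moved.
struct Tracer {
    static int copies;
    static int moves;
    Tracer() = default;
    Tracer(const Tracer&) { ++copies; }
    Tracer(Tracer&&) noexcept { ++moves; }
};
int Tracer::copies = 0;
int Tracer::moves = 0;

// Returns the local by name: eligible for NRVO, with a move as the fallback.
Tracer make_named() {
    Tracer t;
    return t;
}

// Returns std::move(t): the operand is not a name, so NRVO is not permitted
// and the move constructor has to run.
Tracer make_moved() {
    Tracer t;
    return std::move(t);
}

struct Counts {
    int copies;
    int moves;
};

// Calls a factory once and reports the copies and moves it triggered.
template <typename Factory>
Counts measure(Factory make) {
    Tracer::copies = 0;
    Tracer::moves = 0;
    Tracer result = make();
    (void)result;  // only the counters matter here
    return Counts{Tracer::copies, Tracer::moves};
}
```

Calling measure(make_moved) should report at least one move and no copies. measure(make_named) should also report no copies, and on an optimizing compiler it will usually report no moves either, though the standard does not promise that.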

(Update: As haohaolee's pointed out (see comments), the following paragraphs are not correct. However, I leave them here because they suggest an idea that might work for classes that don't have a constructor taking a std::initializer_list (see the reference at the bottom). For std::vector, haohaolee found a workaround.)

In this example you can force the RVO (strictly speaking this is no longer RVO, but let's keep calling it that for simplicity) by returning a braced-init-list from which the return type can be created:

std::vector<double> foo(int bar){
    return {bar, 0}; // <-- This doesn't work. Next line shows a workaround:
    // return {bar, 0.0, std::vector<double>::allocator_type{}};
}

See this post and R. Martinho Fernandes's brilliant answer.

Be careful! Had the return type been std::vector<int>, the last code above would have behaved differently from the original. (This is another story.)
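A small sketch of that difference for std::vector<int>, assuming nothing beyond the standard library. The three function names are made up for illustration. The last function applies the allocator workaround mentioned in the comments to std::vector<int>, and takes a size_type parameter to avoid a narrowing conversion inside the braces.

```cpp
#include <vector>

// Braces select the std::initializer_list constructor:
// the result holds the two elements 3 and 0.
std::vector<int> two_elements() {
    return {3, 0};
}

// Parentheses select the (count, value) constructor:
// the result holds three zeros.
std::vector<int> three_zeros() {
    return std::vector<int>(3, 0);
}

// With a third, allocator argument the initializer_list constructor is no
// longer viable, so the (count, value, allocator) constructor is chosen.
std::vector<int> n_zeros_braced(std::vector<int>::size_type n) {
    return {n, 0, std::vector<int>::allocator_type{}};
}
```

So two_elements() yields a vector of size 2, while three_zeros() and n_zeros_braced(3) both yield a vector of size 3 filled with zeros.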

Cassio Neri
  • I'm curious about the other story. I know that if it was `vector<int>`, `{ bar, 0 }` would create a vector with 2 elements, but how can I force a call with multiple parameters rather than an initializer-list call? – haohaolee Nov 12 '13 at 01:28
  • And for `std::vector<double>`, I didn't get it working on GCC 4.8.1. It creates a vector with two elements, bar and 0, unless a third parameter `std::vector<double>::allocator_type()` is added. – haohaolee Nov 12 '13 at 04:21
  • @haohaolee: You're right. It doesn't work as I expected because the constructor taking `std::initializer_list` takes precedence. Your workaround to change this priority works. However, it's a bit weird to involve `allocator_type` to control overload resolution. I'm not blaming your approach, it's more a C++ (core and library) issue. I'll update the post to reflect these new findings. Thanks. – Cassio Neri Nov 12 '13 at 11:35
  • Hi, unfortunately it still has a problem: `int bar` should be `std::vector<double>::size_type bar`, because a braced-init-list doesn't allow narrowing conversions (gcc gives warnings about this). – haohaolee Nov 13 '13 at 02:00
  • @haohaolee I know, but I hesitated to use `std::vector<double>::size_type` so as not to diverge too much from the OP. :-( – Cassio Neri Nov 13 '13 at 09:23

I highly recommend reading "Inside the C++ Object Model" by Stanley B. Lippman for detailed information and some historical background on how the named return value optimization works.

For example, in chapter 2.1 he has this to say about named return value optimization:

In a function such as bar(), where all return statements return the same named value, it is possible for the compiler itself to optimize the function by substituting the result argument for the named return value. For example, given the original definition of bar():

X bar() 
{ 
   X xx; 
   // ... process xx 
   return xx; 
} 

__result is substituted for xx by the compiler:

void 
bar( X &__result ) 
{ 
   // default constructor invocation 
   // Pseudo C++ Code 
   __result.X::X(); 

   // ... process in __result directly 

   return; 
}

(....)

Although the NRV optimization provides significant performance improvement, there are several criticisms of this approach. One is that because the optimization is done silently by the compiler, whether it was actually performed is not always clear (particularly since few compilers document the extent of its implementation or whether it is implemented at all). A second is that as the function becomes more complicated, the optimization becomes more difficult to apply. In cfront, for example, the optimization is applied only if all the named return statements occur at the top level of the function. Introduce a nested local block with a return statement, and cfront quietly turns off the optimization.
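To illustrate the condition in the quote that all return statements return the same named value, here is a rough sketch in modern C++ where that condition does not hold. Tracer, pick() and copies_made_by_pick() are hypothetical names used only for this example. The assumption is a C++11 or later compiler, where returning a local by name falls back to the move constructor when the copy/move is not elided.

```cpp
// Hypothetical tracer type: counts copy and move constructions.
struct Tracer {
    static int copies;
    static int moves;
    Tracer() = default;
    Tracer(const Tracer&) { ++copies; }
    Tracer(Tracer&&) noexcept { ++moves; }
};
int Tracer::copies = 0;
int Tracer::moves = 0;

// Two different named objects are returned, so the compiler cannot make
// both of them aliases for the caller's result object at the same time.
Tracer pick(bool first) {
    Tracer a;
    Tracer b;
    if (first) {
        return a;
    }
    return b;
}

// Resets the counters, calls pick(), and returns the number of copies made.
int copies_made_by_pick(bool first) {
    Tracer::copies = 0;
    Tracer::moves = 0;
    Tracer result = pick(first);
    (void)result;  // only the counters matter here
    return Tracer::copies;
}
```

Even when the named return value optimization is not applied, copies_made_by_pick() should return 0, because the returned local is moved rather than copied. Tracer::moves shows how many moves the compiler actually emitted.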

Paul

The standard states it pretty clearly, doesn't it? It allows a copy/move constructor to be omitted even if it has side effects. So you should never rely on side effects in copy/move constructors or, if you insist, you should use techniques which prevent (N)RVO. As to the second question, I believe it prohibits NRVO: the operand of the return statement is then a call to std::move rather than the name of a local object, and NRVO requires a name (thanks to @DyP's comment).

Just tested the following code on MSVC:

#include <iostream>

class A
{
public:
    A()
    {
        std::cout << "Ctor\n";
    }
    A(const A&)
    {
        std::cout << "Copy ctor\n";
    }
    A(A&&)
    {
        std::cout << "Move\n";
    }

};

A foo()
{
    A a;
    return a;
}

int main() 
{
    A a = foo();
    return 0;
}

It prints only Ctor, so the side effects of the move constructor are lost. If you add std::move to the return statement in foo(), NRVO is no longer performed.

ixSci
  • Same behavior with g++ 4.7 ... as expected. – thokra Nov 05 '13 at 15:27
  • The question is regarding guarantees of the standard. While a sample from a popular vendor is useful, it doesn't seem to indicate behavior that one could rely on. – Brian Cain Nov 05 '13 at 15:30
  • *"As to the second I believe it prohibits NRVO since std::move produces `T&&` and not `T` which would be candidate for NRVO(RVO)"* References are dropped before analysing an expression [expr]/5. The reason why it's prohibited is that a *name* of an object is required for NRVO. – dyp Nov 05 '13 at 15:30
  • @BrianCain, OP cited standard himself, I've just put it into example. – ixSci Nov 05 '13 at 16:08
  1. This is probably obvious, but if you avoid writing copy/move constructors with side effects (most have no need for them) then the problem is totally moot. Even in simple side-effect cases like construction/destruction counting it should still be fine. The only case to possibly worry about is complicated side effects, and that's a strong design smell telling you to re-examine your code.

  2. This sounds like premature optimization to me. Just write the obvious, easily maintainable code, and let the compiler optimize. Only if profiling shows that certain areas are performing poorly should you consider adopting changes to improve performance.
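As a sketch of why the counting case in point 1 stays consistent: every constructor increments the counter and the destructor decrements it, so an elided copy/move simply removes one matching increment/decrement pair. Counted, make() and live_while_holding() are hypothetical names chosen for this example.

```cpp
// Hypothetical type that tracks how many instances are currently alive.
struct Counted {
    static int live;
    Counted() { ++live; }
    Counted(const Counted&) { ++live; }
    Counted(Counted&&) noexcept { ++live; }
    ~Counted() { --live; }
};
int Counted::live = 0;

// Returns a local by name; the compiler may or may not elide the move.
Counted make() {
    Counted c;
    return c;
}

// Reports the number of live objects while exactly one result is held.
int live_while_holding() {
    Counted held = make();
    (void)held;  // keep the object alive until the end of the function
    return Counted::live;
}
```

live_while_holding() returns 1 whether or not elision took place, and Counted::live is back to 0 once it has returned.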

Mark B