Why isn't RVO / NRVO always applied?

Question

A brief (and possibly dated and over-simplified) summary of the return value optimization mechanics reads like this:

an implementation may create a hidden object in the caller's stack frame, and pass the address of this object to the function. The function's return value is then copied into the hidden object (...) Around 1991, Walter Bright invented a technique to minimize copying, effectively replacing the hidden object and the named object inside the function with the object used for holding the result [1]

Since it's a topic greatly discussed on SO, I'll only link the most complete QA I found.

My question is, why isn't the return value optimization always applied? More specifically (based on the definition in [1]), why doesn't this replacement happen for every function call? Function return types (and hence their size on the stack) are always known at compile time, and this seems to be a very useful feature.

As it says [here](http://stackoverflow.com/a/12953145/1413395), the fact that an implementation _is allowed_ to do this doesn't necessarily mean you can actually rely on it. Or did I misunderstand your question? – πάντα ῥεῖ, Dec 30 '14 at 17:28
@πάνταῥεῖ Your comment is correct. I'm just trying to understand what the technical reasons are behind this being "not always implementable". From what I had read in the descriptions (I was never involved in compiler construction, obviously) it looked feasible, but by reading the answers I can see various kinds of reasons that prevent it from being applied everywhere: core language / value semantics / program logic / exceptions ... I'm still trying to wrap my mind around these. – Lorah Attkins, Dec 30 '14 at 18:04
Well, some compilers may support it, others not. There are still a lot of poor C++ compiler implementations around (especially for more or less exotic CPU architectures in the embedded field). If the spec says some implementation behavior is optional, that may find broader acceptance among compiler vendors and make it easier for them to catch up with newer standards. – πάντα ῥεῖ, Dec 30 '14 at 18:11

score 5 · Accepted Answer · edited Dec 30 '14 at 17:39

Obviously, when an lvalue that outlives the call (a global, say, or an object reached through a reference) is returned by value, there is no way to avoid a copy. So, let's consider only local variables. A simple reason that applies to local variables is that it is often unclear which object is going to be returned. Consider code like this:

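template <typename T, typename... Args>
T f(Args... args) {
    T v1{some_init(args...)};    // first candidate for the result
    T v2{some_other(args...)};   // second candidate for the result
    bool rc = determine_result(v1, v2);
    return rc ? v1 : v2;         // the choice is only known at run time
}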

At the point where the local variables v1 and v2 are created, the compiler has no way to tell which one is going to be returned, so it cannot construct either of them in place of the return value.

Another reason is that copy/move construction and destruction can have deliberate side effects, so it is desirable to have ways to inhibit copy elision. At the time copy elision was introduced, there was already a lot of C++ code around that may have depended on certain copies being made, so only a few situations were made eligible for copy elision.

edited Dec 30 '14 at 17:39 by Columbo
answered Dec 30 '14 at 17:36 by Dietmar Kühl

I think you need `if (rc) return v1; else return v2;` for RVO to be allowed at all. Pretending you did write that, of course, if the compiler can determine what `rc` will evaluate to (for example, if `determine_result` doesn't actually look at either `v1` or `v2`, but simply always returns true, because of some compile-time configuration option), then RVO is still possible and allowed even here. – Dec 30 '14 at 17:38
@hvd: I don't think _copy-elision_ (no, you won't get me to incorrectly call it :-) is allowed in the above case! See 12.8 [class.copy] paragraph 31 which allows copy-elision for a `return` statement only when the expression is a variable name (bullet 1) or when a temporary object is returned (bullet 3). – Dietmar Kühl Dec 30 '14 at 17:42
Yes, that's right, I had already edited my comment to cover that: you can meet the RVO requirements (don't worry, I'm not asking you to call it that) by changing the code to use an `if` statement with two separate `return` statements in the two branches. – Dec 30 '14 at 17:57

score 1 · Answer 2 · answered Dec 30 '14 at 17:33

Requiring that the implementation do this could be a de-optimization in certain circumstances, such as when the return value is thrown away. Once exceptions are involved, it also becomes difficult to prove that an implementation is correct.

Instead, the standard takes the easy way and lets the implementation decide when to do the optimization and when doing it would be counter-productive.

answered Dec 30 '14 at 17:33 by Mark B

Note that the compiler needs to make a copy or move when it doesn't do copy elision! Since the copy/move or the destructor (of the copied/moved object) can have side-effects it can't not just not do it! That is, copy-elision is **always** doing less work. – Dietmar Kühl Dec 30 '14 at 17:46
@DietmarKühl I think you might have one too many *not*s in there. – T.C. Dec 30 '14 at 17:53
@T.C.: you are right: I didn't account for the negation in "can't"... Assume the statement is adjusted using `s/can't/can/` (with the regular expression appropriately quoted...). – Dietmar Kühl Dec 30 '14 at 17:56
