1

Consider the following code:

LargeObject getLargeObject()
{
    LargeObject glo;
    // do some initialization stuff with glo
    return glo;
}

void test()
{
    LargeObject tlo = getLargeObject();
    // do something with tlo
}

A simple compiler would create a local LargeObject glo on the getLargeObject() stack and then assign it to tlo in test() when returning, which involves a copy operation.

But shouldn't a clever compiler realize that glo is going to be assigned to tlo and thus just use tlo's memory in the first place to avoid the copy operation? That would result in something (functionally) like:

void getLargeObject(LargeObject &lo)
{
    // do init stuff
}

void test()
{
    LargeObject lo;
    getLargeObject(lo);
}

My guess is that compilers do something similar. But can it always be done? Are there situations where it can't be optimized like that? How can I know if my return value is copied or not?

John Dibling
Ben

3 Answers

4

Your guess is correct. And yes, there are situations where it cannot be done, for example:

LargeObject getLargeObject()
{
    LargeObject glo1, glo2;
    // do some initialization stuff         
    if (rand() % 2)
        return glo1;
    return glo2;
}

It can't be done there because the compiler can't know whether it will use glo1 or glo2 for the return value.

"How can I know if my return value is copied or not?"

Two ways I can think of. You could create noisy copy constructors, that is, copy constructors that have some detectable side effect, like printing a message. Then, of course, there's the old standby of looking at the generated assembly.
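A minimal sketch of the noisy copy constructor idea, reusing the names from the question. The `copies` counter and the `copiesForOneCall()` helper are assumptions added for illustration; they are not part of the original code.

```cpp
#include <iostream>

// Stand-in for the question's LargeObject. The static counter is an
// illustrative addition so the copies can be counted as well as printed.
struct LargeObject {
    static int copies;              // copy-constructor calls so far
    LargeObject() {}
    LargeObject(const LargeObject&) {
        ++copies;
        std::cout << "LargeObject copied\n";   // the detectable side effect
    }
};
int LargeObject::copies = 0;

LargeObject getLargeObject()
{
    LargeObject glo;
    // do some initialization stuff with glo
    return glo;                     // a single named local: NRVO candidate
}

// Hypothetical helper: reports how many copies one call-and-initialize made.
int copiesForOneCall()
{
    LargeObject::copies = 0;
    LargeObject tlo = getLargeObject();
    (void)tlo;
    return LargeObject::copies;
}
```

Calling `copiesForOneCall()` should return 0 when the compiler elides everything, which is what I would expect from mainstream compilers here. The exact number depends on the compiler, the language standard, and flags that disable elision, so treat it as something to measure rather than assume.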

Benjamin Lindley
  • There are even cases where it can do so here. RVO is part of the ABI on many platforms, meaning that the caller reserves the space of the return variable and passes the address. A good compiler will then construct an object at that place at the latest point possible, and when tracing back the execution path, under certain circumstances (e.g. no side-effects of the ctor that need to be considered for that purpose) it can first check the if, and then construct the object in-place. Admittedly, these cases are rare, but compilers these days are improving even here. – PlasmaHH Nov 22 '11 at 15:06
  • More generally, if there is more than one `return` in the function, some compilers will fail to do RVO. And of course, if you use both `glo1` and `glo2` within the function, in a way that requires two objects, and sometimes return one, sometimes the other, there's no way the compiler can avoid the copy. – James Kanze Nov 22 '11 at 15:42
  • It is important to note (James Kanze mentions it in his answer, but it is missing here) that there are actually *two* copies: one from the `gloX` to the returned temporary, and a separate one from the temporary to the object constructed in `test`. Eliding the second copy is what the ABI handles, as PlasmaHH mentions, and that elision is *always* performed (on all platforms I know), but eliding the first copy depends on the compiler being able to know which of `glo1` or `glo2` will be returned when constructing the object. (More [here](http://definedbehavior.blogspot.com/2011/08/value-semantics-nrvo.html)) – David Rodríguez - dribeas Nov 22 '11 at 15:56
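To observe the two-locals case discussed in this answer and its comments, something like the following sketch could be used. The `pickFirst` parameter is an assumption that replaces `rand() % 2` so the result is repeatable, and the `id` member, `copies` counter, and `copiesForOneCall()` helper are likewise illustrative additions.

```cpp
#include <iostream>

// Stand-in for LargeObject with a noisy, counting copy constructor.
struct LargeObject {
    static int copies;              // copy-constructor calls so far
    int id;                         // tells glo1 and glo2 apart
    explicit LargeObject(int i) : id(i) {}
    LargeObject(const LargeObject& other) : id(other.id) {
        ++copies;
        std::cout << "LargeObject " << id << " copied\n";
    }
};
int LargeObject::copies = 0;

// Both locals are alive at the same time, so they cannot both be built
// directly in the caller's storage.
LargeObject getLargeObject(bool pickFirst)
{
    LargeObject glo1(1), glo2(2);
    // do some initialization stuff
    if (pickFirst)
        return glo1;
    return glo2;
}

// Hypothetical helper: copies made by one call-and-initialize.
int copiesForOneCall(bool pickFirst)
{
    LargeObject::copies = 0;
    LargeObject tlo = getLargeObject(pickFirst);
    (void)tlo;
    return LargeObject::copies;
}
```

With the compilers I am aware of, this typically reports one copy (from the chosen local into the return slot), while the copy into `tlo` is elided. That is an expectation, not a guarantee, so check it on your own toolchain.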
2

Yes, it should. This is called the named return value optimization (NRVO), often referred to simply as RVO.

John
  • Yes, in this particular case, it's NRVO since there's a named variable that is returned. RVO is when an instance is created and returned in the return statement. – Johann Gerell Nov 22 '11 at 16:41
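The distinction drawn in the comment above can be sketched as follows. The `value` member and the function names `makeUnnamed()` and `makeNamed()` are hypothetical, chosen only to make the two shapes of `return` visible.

```cpp
// Minimal stand-in type; the int member is an illustrative assumption.
struct LargeObject {
    int value;
    explicit LargeObject(int v) : value(v) {}
};

// RVO: the object is created as an unnamed temporary in the return
// statement itself.
LargeObject makeUnnamed()
{
    return LargeObject(42);
}

// NRVO: the object is a named local that is initialized, possibly
// modified, and then returned by name.
LargeObject makeNamed()
{
    LargeObject result(42);
    result.value += 1;      // work done on the named object before returning
    return result;
}
```

Both functions return by value. Whether a copy is elided affects only how many constructor calls happen, not the value the caller sees.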
2

For starters, even a naïve compiler will not “assign to tlo”, since the standard doesn't allow it. The formal semantics of your code involves two copies (both using the copy constructor); the first from glo to a temporary return value, and the second from this temporary return value to tlo. The standard, however, formally gives compilers the right to eliminate both of these copies, in this specific case, and practically speaking, I imagine that all compilers do.

The first copy can be suppressed any time you return a local variable or a temporary; some compilers don't do it if there is more than one return in the code, however (but that will never be the case in well-written code).

The suppression of the second copy depends on the fact that you are constructing a new object at the call site. If you're not constructing a new object, then there may not even be a second copy to suppress; e.g. in a case like getLargeObject().memberFunction(). If you're assigning to an existing object, however, there's not much the compiler can do; it must call the assignment operator. If the assignment operator copies, then you get that copy.
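A small sketch of the difference between constructing a new object and assigning to an existing one at the call site. The `assignments` counter and the two helper functions are assumptions added for illustration.

```cpp
#include <iostream>

// Stand-in for LargeObject with a noisy copy constructor and a noisy,
// counting assignment operator.
struct LargeObject {
    static int assignments;         // operator= calls so far
    LargeObject() {}
    LargeObject(const LargeObject&) { std::cout << "copy constructor\n"; }
    LargeObject& operator=(const LargeObject&) {
        ++assignments;
        std::cout << "copy assignment\n";
        return *this;
    }
};
int LargeObject::assignments = 0;

LargeObject getLargeObject()
{
    LargeObject glo;
    return glo;
}

// Case 1: a new object is constructed at the call site. This is
// initialization, so operator= is never involved.
int assignmentsWhenInitializing()
{
    LargeObject::assignments = 0;
    LargeObject tlo = getLargeObject();
    (void)tlo;
    return LargeObject::assignments;
}

// Case 2: an existing object is assigned to. The compiler must call
// operator=, whatever it does about the copies inside getLargeObject().
int assignmentsWhenAssigning()
{
    LargeObject::assignments = 0;
    LargeObject tlo;
    tlo = getLargeObject();
    return LargeObject::assignments;
}
```

The first helper returns 0 and the second returns 1 regardless of optimization level, because calling `operator=` in an assignment is not something the compiler is allowed to skip for a type with observable side effects.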

James Kanze
  • Wow, there's some interesting information in that answer! I'm not sure, though, that I've understood it all correctly. The first and last sentences seem contradictory to me. Or is it just a language thing? – Ben Nov 22 '11 at 16:00
  • @James: *but that will never be the case in well written code* - Wow - no politics at SO! :) – Johann Gerell Nov 22 '11 at 16:42
  • @Ben The problem is that in the first sentence, I'm using _copy_ in the sense of invoking the copy constructor---I should have been more precise. In the last sentence, I'm using it in its more general sense: an assignment operator might "copy" a lot of data. (In fact, a typical assignment operator for a large object might copy construct a new instance, then swap the data.) – James Kanze Nov 22 '11 at 17:11
  • @James: still not clear :( You're not using 'copy' in your first sentence. What's not allowed in the standard? Does RVO only get rid of the first copy operation? If I passed a reference to the called function there would be no copy operation at all... can't it be optimized in a similar way? – Ben Nov 22 '11 at 17:29
  • @Ben RVO basically only gets rid of the first copy. The second copy depends on what you're doing at the call site. If you're constructing a new object, the compiler can arrange for the returned temporary to be in its place. If you're doing something else, it generally can't. What's not allowed by the standard is to not call `operator=` in an assignment; and if you do a copy inside `operator=` (a frequent situation), then this copy cannot be optimized out. – James Kanze Nov 22 '11 at 17:54