3

Original code

#include <iostream>
int global;
struct A
{
   A(){}
   A(const A&x){
       ++global;
   }
   ~A(){}
};
A foo()
{  
     A a;
     return a;  
}
int main()
{
   A x = foo();
   std::cout << global;
}

The output is 0 when the compiler applies Named Return Value Optimization (NRVO).

When I change the definition of foo to

A foo()
{ 
  { 
     A a;
     return a;  
  }
}

I get 1 as the output, i.e. the copy constructor gets called once. What could be the reason? Introducing a dummy scope changes the behavior of the code completely. What am I missing?

I tested it on g++ compiler. Any compiler guy around here who can explain the scenario in some implementation-specific manner?

EDIT

I tested it on clang and it optimizes the call to the copy c-tor even in the second case.

Andrew Pinski (gcc guy) confirmed that this is indeed a case of missed optimization on g++.

Prasoon Saurav
  • Such optimisations may be applied or not at a compiler's whim. Trying to reason about them is ... folly. No? – Lightness Races in Orbit Dec 15 '11 at 15:40
  • When you say "Output would be 0 on an optimized compiler" did you observe this, or just assume it? – patros Dec 15 '11 at 15:42
  • @patros : My assumption and observation are just the same. :) – Prasoon Saurav Dec 15 '11 at 15:43
  • @PrasoonSaurav: Well the difference is that you observed it in one instance, then assumed it for all. – Lightness Races in Orbit Dec 15 '11 at 15:43
  • I think this is one of the reasons why "move semantics" were introduced in c++0x .. a standardized way and also a guarantee :) – Arunmu Dec 15 '11 at 15:44
  • Because the compiler needs to be very strict about applying optimizations. If any of its pre-conditions fail then the optimization cannot be applied. It may seem obvious to a human that it can be done here, but in the general case an extra scope adds some horrendous complexity to the pre-condition checks, which then fail. The only good answer here is @Tomalak's: optimizations are done or not done at the compiler's choice (heuristics). – Martin York Dec 15 '11 at 15:54
  • @TomalakGeret'kal : I didn't assume it for all instances. Nobody can. I was just puzzled to see the difference in behavior on g++ by just introducing `{}`. – Prasoon Saurav Dec 15 '11 at 16:08
  • @Prasoon: OK. `Output would be 0 on an optimized compiler` implied to me that you were expecting a reproducible result on an _arbitrary_ optimized compiler, as opposed to just one run on your PC. – Lightness Races in Orbit Dec 15 '11 at 16:15
  • Man, don't continue this on SO, bring it to GCC's bugzilla! :) – Kos Dec 15 '11 at 16:26
  • It's funny; you actually shared [this question](http://stackoverflow.com/questions/8410877/is-rvo-return-value-optimization-on-unnamed-objects-a-universally-guaranteed-b) on Facebook a week ago. – Lightness Races in Orbit Dec 15 '11 at 17:30
  • @TomalakGeret'kal : That was a completely different question. My answer was `Modern compilers are intelligent enough to do such kind of optimization.` This is altogether a different scenario. – Prasoon Saurav Dec 15 '11 at 17:37
  • @Prasoon: It asks whether RVO is "universally guaranteed", and the answer given is "no". A week later you post this, apparently surprised that your RVO vanished under some situation. Lol! – Lightness Races in Orbit Dec 15 '11 at 17:38
  • @TomalakGeret'kal : This question doesn't ask for cross implementation behavior. More importantly I don't find this funny at all. – Prasoon Saurav Dec 15 '11 at 17:40
  • @PrasoonSaurav: (a) RVO _is_ cross-implementation behaviour. (b) That's too bad; I find a sense of humour to be crucial to a fruitful existence. – Lightness Races in Orbit Dec 15 '11 at 17:45
  • @TomalakGeret'kal : I am simply asking why adding a scope `{}` changes the behavior of the code under the same implementation i.e g++. :-) – Prasoon Saurav Dec 15 '11 at 17:47
  • @PrasoonSaurav: Refer to my initial comment. – Lightness Races in Orbit Dec 15 '11 at 17:49

5 Answers

3

I don't see any reason other than that the compiler is not smart enough to see that the dummy scope (introduced by the extra braces) makes no difference at all, at least for this particular code. The compiler is being fooled by the extra braces; it probably made some wild assumptions about the rest of the function body (which doesn't even exist).

Zero or 1, either way the behavior is completely Standard conformant, as the Standard doesn't require the compiler to produce 0 (or 1, for that matter). So it is up to the compiler, as you already know.

The assembly code generated for foo in the two cases differs in just one small detail:

  • First code:

    __Z3foov:
    LFB992:
         .cfi_startproc
         movl  4(%esp), %eax
         ret   $4
         .cfi_endproc
    
  • Second code:

    __Z3foov:
    LFB992:
         .cfi_startproc
         incl _global      <----- incrementing the global. God knows why!
         movl  4(%esp), %eax
         ret   $4
         .cfi_endproc
    

I used g++ -O6. Version : MinGW (GCC) 4.6.1

Nawaz
  • @Downvoter: Please specify the reason. Or even better post an answer, as it seems you know *the rationale* behind the output in the second case. – Nawaz Dec 15 '11 at 16:35
  • @PrasoonSaurav: Why? What atheism/theism has to do with this? :| (or you're implying that atheists are irrational, and so are able to provide rationale even for irrationality? :P) – Nawaz Dec 15 '11 at 18:15
  • @Nawaz : `incrementing the global. God knows why!` :) – Prasoon Saurav Dec 15 '11 at 18:16
2

Theoretically you could get 1 in both situations. This is one of the situations when the compiler may or may not optimize the copy constructor away.

You can find more detailed information on the subject by searching for "Return value optimization" and "Named return value optimization"; your case is the latter.

Note if you change the code to:

A foo()
{ 
  { 
     return A();
  }
}

then the RVO should kick in and you'll obtain 0 on the output.

Why didn't NRVO kick in in the case you described? (I've confirmed this on GCC 4.6.) I'm not sure at this point; either the compiler isn't smart enough or there's a rule about NRVO which disallows it here.


Edit:

The standard says...

When certain criteria are met, an implementation is allowed to omit the copy/move construction of a class object, even if the copy/move constructor and/or destructor for the object have side effects. (...) This elision of copy/move operations, called copy elision, is permitted in the following circumstances:

— in a return statement in a function with a class return type, when the expression is the name of a non-volatile automatic object (other than a function or catch-clause parameter) with the same cv-unqualified type as the function return type, the copy/move operation can be omitted by constructing the automatic object directly into the function’s return value

Hence it's permitted here, but the compiler wasn't smart enough to perform NRVO here. If you work on GCC, you can check if it's the same on Clang and see if the result is different (my gut feeling says it will).

Note that RVO and NRVO are names for compiler features, while "copy elision" is how the standard refers to this behaviour in general.

Community
Kos
2

With which compiler? Both results are legal, so formally, you can't complain. Practically, I can't see why introducing the scope should change anything; as a quality of implementation issue, I think you could complain.

James Kanze
0

A compiler may miss some cases where an optimization is possible even if it performs it in others. AFAIK (and I have now checked the passage that allows copy constructor elision) there is no constraint that the variable must be in the top scope of the function; it just has to be an automatic variable.

AProgrammer
-5

There will be a copy as you return by value. Optimizing settings can (should) never change the visible behaviour of a program.

paul23
  • http://en.wikipedia.org/wiki/Return_value_optimization -- Note the second sentence in the opening paragraph. – Benjamin Lindley Dec 15 '11 at 15:37
  • This is actually not true. You'd think it is, but it's not. Some copies can be elided despite altering side-effects. – Lightness Races in Orbit Dec 15 '11 at 15:39
  • In this case, the standard explicitly gives the compiler the right to ignore observable behavior from the copy constructor. – James Kanze Dec 15 '11 at 15:49
  • The standard allows copy constructor calls to be removed (even if they have side effects) as long as the rest of the code does not change. Thus the compiler can use RVO here and build the return value at the call site, removing the need for a copy. – Martin York Dec 15 '11 at 15:52