2

I stumbled upon a few articles claiming that passing by value can improve performance if the function is going to make a copy anyway.
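For context, the idiom those articles describe can be sketched roughly like this. It is a minimal illustration only; the `Person` class and the `sink_demo` function are hypothetical names, not taken from any of the articles:

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical class used only for illustration.
class Person {
public:
    // The constructor needs its own copy of the name anyway, so it takes
    // the argument by value and moves it into the member.
    explicit Person(std::string name) : name_(std::move(name)) {}

    const std::string& name() const { return name_; }

private:
    std::string name_;
};

// Builds one Person from an lvalue (the string is copied into the parameter)
// and one from an rvalue (the string is moved or constructed in place).
bool sink_demo() {
    std::string stored = "Ada";
    Person from_lvalue(stored);                       // copy into the parameter
    Person from_rvalue(std::string("Grace Hopper"));  // no copy of the characters
    assert(stored == "Ada");  // the lvalue argument is left untouched
    return from_lvalue.name() == "Ada" &&
           from_rvalue.name() == "Grace Hopper";
}
```

With a `const std::string&` parameter instead, both calls would have to copy the string into the member; the by-value version only copies when the caller passes an lvalue.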

I never really thought about how pass-by-value might be implemented under the hood. What exactly happens on the stack when you do something like this: F v = f(g(h()))?

After pondering a bit, I came to the conclusion that I'd implement it so that the value returned by g() is created in the location where f() expects it to be. So, basically, no copy/move constructor calls -- f() simply takes ownership of the object returned by g() and destroys it when execution leaves f()'s scope. Same for g() -- it takes ownership of the object returned by h() and destroys it on return.

Alas, compilers seem to disagree. Here is the test code:

#include <cstdio>

using std::printf;

struct H
{
    H() { printf("H ctor\n"); }
    ~H() { printf("H dtor\n"); }
    H(H const&) {}
//    H(H&&) {}
//    H(H const&) = default;
//    H(H&&) = default;
};

H h() { return H(); }

struct G
{
    G() { printf("G ctor\n"); }
    ~G() { printf("G dtor\n"); }
    G(G const&) {}
//    G(G&&) {}
//    G(G const&) = default;
//    G(G&&) = default;
};

G g(H) { return G(); }

struct F
{
    F() { printf("F ctor\n"); }
    ~F() { printf("F dtor\n"); }
};

F f(G) { return F(); }

int main()
{
    F v = f(g(h()));
    return 0;
}

On MSVC 2015 its output is exactly what I expected:

H ctor
G ctor
H dtor
F ctor
G dtor
F dtor

But if you comment out the copy constructors, it looks like this:

H ctor
G ctor
H dtor
F ctor
G dtor
G dtor
H dtor
F dtor

I suspect that removing the user-provided copy constructor causes the compiler to generate a move constructor, which in turn causes an unnecessary 'move' that doesn't go away no matter how big the objects in question are (try adding a 1MB array as a member variable). I.e. the compiler prefers 'move' so much that it chooses it over not doing anything at all.

It seems like a bug in MSVC, but I would really like someone to explain (and/or justify) what is going on here. This is question #1.

Now, if you try GCC 5.4.0, the output simply doesn't make any sense:

H ctor
G ctor
F ctor
G dtor
H dtor
F dtor

H has to be destroyed before F is created! H is local to g()'s scope! Note that playing with constructors has zero effect on GCC here.

Same as with MSVC -- looks like a bug to me, but can someone explain/justify what is going on here? That is question #2.

It is really silly that after many years of working with C++ professionally I run into issues like this... After almost 4 decades compilers still can't agree on how to pass values around?

C.M.
  • just in case -- I know what RVO is... and I know C++ extremely well. And yet I can't find a good answer to these two questions – C.M. Oct 01 '16 at 06:33
  • What optimization levels are you using? "Compilers can't agree": copy-constructor elision is an *optimization*, so whether it happens or not is a QOI issue (and the correctness of your program must not depend on it). – Martin Bonner supports Monica Oct 01 '16 at 07:12
  • Optimization levels have no effect. I can't find anything in the standard regarding copy/move constructor elision that would explain what's going on here. Note that we are talking about passing *rvalues* of the proper type as arguments to another function -- I kinda expected not to have any extra copies (or moves) here... – C.M. Oct 01 '16 at 17:22
  • This link may help you figure out what is going on: http://en.cppreference.com/w/cpp/language/copy_elision Good luck – Ahmad Siavashi Oct 02 '16 at 22:15
  • http://stackoverflow.com/questions/12953127/what-are-copy-elision-and-return-value-optimization – Ahmad Siavashi Oct 02 '16 at 22:51

3 Answers

4

For passing a parameter by value, the parameter is a local variable to the function, and it's initialized from the corresponding argument to the function call.

When returning by value, there is a value called the return value. This is initialized by the "argument" to the return expression. Its lifetime is until the end of the full-expression containing the function call.

Also there is an optimization called copy elision which can apply in a few cases. Two of those cases apply to returning by value:

  • If the return value is initialized by another object of the same type, then the same memory location can be used for both objects, and the copy/move step skipped (there are some conditions on exactly when this is allowed or disallowed)
  • If the calling code uses the return value to initialize an object of the same type, then the same memory location can be used for both the return value and the destination object, and the copy/move step is skipped. (Here the "object of the same type" includes function parameters).

It is possible for both of these to apply simultaneously. Also, as of C++14, copy elision is optional for the compiler.
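The two cases above can be observed with a small instrumented type. This is only a sketch: `Counted`, `make`, `take` and `elision_demo` are hypothetical names introduced for illustration, and the number it reports depends on the compiler, the language mode and flags such as `-fno-elide-constructors`.

```cpp
#include <cassert>

// Hypothetical instrumented type: counts how often each special member runs.
struct Counted {
    static int ctors, copies, moves, dtors;
    Counted() { ++ctors; }
    Counted(Counted const&) { ++copies; }
    Counted(Counted&&) noexcept { ++moves; }
    ~Counted() { ++dtors; }
};
int Counted::ctors = 0;
int Counted::copies = 0;
int Counted::moves = 0;
int Counted::dtors = 0;

Counted make() { return Counted(); }  // case 1: return value built from a temporary
void take(Counted) {}                  // case 2: parameter built from the return value

// Returns the number of copy/move constructor calls observed for take(make()).
int elision_demo() {
    Counted::ctors = Counted::copies = Counted::moves = Counted::dtors = 0;
    take(make());
    // Whatever was elided, every object that was constructed is destroyed
    // by the end of the full-expression above.
    assert(Counted::ctors + Counted::copies + Counted::moves == Counted::dtors);
    return Counted::copies + Counted::moves;
}
```

`elision_demo()` returns 0 when both elisions are applied, and up to 2 (both of them moves) when neither is.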

In your call f(g(h())), here is the list of objects (without copy elision):

  1. H default-constructed by return H();
  2. H, the return value of h(), is copy-constructed from (step 1).
  3. ~H (step 1)
  4. H, the parameter of g, is copy-constructed from (step 2).
  5. G default-constructed by return G();
  6. G, the return value of g(), is copy-constructed from (step 5).
  7. ~G (step 5)
  8. ~H (step 4) (see below)
  9. G, the parameter of f, is copy-constructed from (step 6).
  10. F default-constructed by return F();
  11. F, the return value of f(), is move-constructed from (step 10).
  12. ~F (step 10)
  13. ~G (step 9) (see below)
  14. F v is move-constructed from (step 11).
  15. ~H, ~G, ~F (the return values from steps 2, 6, 11) are destroyed - I think there is no required ordering of the three
  16. ~F (step 14)

For copy elision, steps 1+2+3 can be combined into "Return value of h() is default-constructed". Similarly for 5+6+7 and 10+11+12. However, it is also possible to combine 2+4 on their own into "Parameter of g is copy-constructed from 1", and it is possible for both of these elisions to apply simultaneously, giving "Parameter of g is default-constructed".

Because copy elision is optional you may see different results from different compilers. It doesn't mean there is a compiler bug. You'll be glad to hear that in C++17 some copy elision scenarios are being made mandatory.

Your output in the second MSVC case would be more instructive if you included output text for the move-constructor. I would guess that in the first MSVC case it performed both simultaneous elisions that I mentioned above, whereas the second case omits the "2+4" and "6+9" elisions.
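As a rough sketch of what such instrumentation could look like, the following variant records every constructor, copy, move and destructor call in a log. The `Traced` template, `event_log` and `trace_chain` are hypothetical names used only for this illustration; unlike the code in the question, the copy and move constructors here are user-provided so that they can log.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical global log, so the sequence of events can be inspected afterwards.
std::vector<std::string>& event_log() {
    static std::vector<std::string> log;
    return log;
}

// One instrumented class per letter; every special member records itself.
template <char Name>
struct Traced {
    Traced() { note("ctor"); }
    Traced(Traced const&) { note("copy"); }
    Traced(Traced&&) noexcept { note("move"); }
    ~Traced() { note("dtor"); }

private:
    static void note(const char* what) {
        event_log().push_back(std::string(1, Name) + " " + what);
    }
};

using H = Traced<'H'>;
using G = Traced<'G'>;
using F = Traced<'F'>;

H h() { return H(); }
G g(H) { return G(); }
F f(G) { return F(); }

// Runs the expression from the question and returns the recorded events.
std::vector<std::string> trace_chain() {
    event_log().clear();
    {
        F v = f(g(h()));
        (void)v;
    }  // v is destroyed here, after all temporaries and parameters
    assert(!event_log().empty());
    return event_log();
}
```

Printing the returned vector line by line shows where any "move" entries appear, which makes it easier to tell which of the elisions a given compiler skipped.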

below: gcc and clang delay destruction of function parameters until the end of the full-expression that enclosed the function call. This is why your gcc output differs from MSVC.

As of the C++17 drafting process, it is implementation-defined whether these destructions occur where I had them in my list, or at the end of the full-expression. It could be argued that it was insufficiently specified in the earlier published standards. See here for further discussion.
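The difference in parameter destruction time can be made visible with a sketch like the one below. `Param`, `callee`, `after` and `parameter_lifetime_demo` are hypothetical names for illustration; the order of the recorded events is exactly the part that varies between implementations, so no particular output is implied.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical log of events, in the order they happen.
std::vector<std::string>& order_log() {
    static std::vector<std::string> log;
    return log;
}

struct Param {
    Param() {}
    ~Param() { order_log().push_back("param destroyed"); }
};

bool callee(Param) {
    order_log().push_back("callee body");
    return true;
}

bool after() {
    order_log().push_back("after call");
    return true;
}

std::vector<std::string> parameter_lifetime_demo() {
    order_log().clear();
    // && sequences callee(...) before after(), but both calls belong to the
    // same full-expression, so the end of that expression comes after after().
    bool both = callee(Param()) && after();
    assert(both);
    (void)both;
    return order_log();
}
```

If "param destroyed" is logged before "after call", the parameter was destroyed when `callee` returned (the behaviour described above for MSVC); if it is logged after, destruction was delayed until the end of the full-expression (the behaviour described above for gcc and clang).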

M.M
  • Somehow I missed your edit... Anyways, I disagree on "applying retroactively" -- behavior is well-defined by C++11 standard. MSVC is non-compliant. But I am glad standard is gonna be changed to allow MSVC implementation -- correct move. Next correct move would be to forbid any other implementations (since copy elision becomes mandatory). – C.M. Oct 14 '16 at 12:16
  • @C.M. Defect Reports are considered to apply retroactively to the document that the defect was filed against. A more famous case is that the C++11 text specifies `A a; A const& b {a};` should copy-construct a temporary that `b` binds to, instead of binding directly. – M.M Oct 14 '16 at 23:49
  • Where did you get this info? I never heard about this -- does it mean that it applies to C++98 too? – C.M. Oct 15 '16 at 07:17
2

This behavior is caused by an optimization technique called copy elision. In a nutshell, all of the outputs you mentioned are valid! Yep! That is because this technique is (the only one) allowed to change the observable behavior of the program. More information can be found at What are copy elision and return value optimization?

Ahmad Siavashi
0

Both M.M's and Ahmad's answers pointed me in the right direction, but neither was fully correct. So I opted to write down a proper answer below...

  • function call and return in C++ have the following semantics:
    • the value passed as a function argument gets copied into the function's scope and the function gets invoked
    • the return value gets copied into the caller's scope, gets destroyed (when we reach the end of the return full-expression) and execution leaves the function's scope

When it comes to implementing this on an IA-32-like architecture, it becomes painfully obvious that these copies are not required -- it is trivial to allocate uninitialized space on the stack (for the return value) and define the calling convention so that the function knows where to construct the return value.

Same for argument passing -- if we pass an rvalue as a function argument, the compiler can direct creation of that rvalue so that it is created right where the (subsequently called) function expects it to be.

I imagine this is the main reason why copy elision was introduced into the standard (and is made mandatory in C++17).

I am familiar with copy elision in general and had read this page before. Unfortunately, I missed two things:

  1. the fact that this also applies to initialization of function parameters from an rvalue (C++11 12.8.p32):

when a temporary class object that has not been bound to a reference (12.2) would be copied/moved to a class object with the same cv-unqualified type, the copy/move operation can be omitted by constructing the temporary object directly into the target of the omitted copy/move

  2. when copy elision kicks in, it affects object lifetime in a very peculiar way:

When certain criteria are met, an implementation is allowed to omit the copy/move construction of a class object, even if the copy/move constructor and/or destructor for the object have side effects. In such cases, the implementation treats the source and target of the omitted copy/move operation as simply two different ways of referring to the same object, and the destruction of that object occurs at the later of the times when the two objects would have been destroyed without the optimization. This elision of copy/move operations, called copy elision, is permitted in the following circumstances (which may be combined to eliminate multiple copies)

This explains the GCC output -- we pass an rvalue into a function, copy elision kicks in, and we end up with one object that is referred to in two different ways and whose lifetime is the longest of them (which is the lifetime of the temporary in our F v = ...; expression). So, basically, the GCC output is completely standard compliant.

Now, this also means that MSVC is not standard compliant! It successfully applied both copy elisions, but the resulting object lifetime is too short.

The second MSVC output conforms to the standard -- it applied RVO but decided not to apply copy elision for the function parameter. I still think it is a bug in MSVC, even though the code is OK from the standard's point of view.

Thank you both, M.M and Ahmad, for pushing me in the right direction.

Now a little rant about the lifetime rule enforced by the standard -- I think it was meant to be used only with RVO.

Alas, it doesn't make a lot of sense when applied to eliding the copy of a function argument. In fact, combined with the C++17 mandatory copy elision rule, it permits crazy code like this:

T bar();
T* foo(T a) { return &a; }

auto v = foo(bar())->my_method();

This rule forces T to be destroyed only at the end of the full expression. This code will become correct in C++17. It is ugly and should not be allowed, in my opinion. Plus, you'll end up destroying these objects on the caller's side (instead of inside the function) -- needlessly increasing code size and complicating the process of figuring out whether a given function is nothrow or not.

In other words, I personally prefer MSVC output #1 (as the most 'natural'). Both MSVC output #2 and the GCC output should be banned. I wonder if this idea can be sold to the C++ standardization committee...

edit: apparently in C++17 the lifetime of the temporary will become 'unspecified', thus allowing MSVC's behavior. Yet another unnecessary dark corner in the language. They should have simply mandated MSVC's behavior.

C.M.
  • It's lifetime of function parameters that is in question (not lifetime of temporary), and it will be implementation-defined in C++17 (not unspecified). Your example with `&a`, it will be implementation-defined whether or not that code is correct. (So this construct should not appear in portable code). Also, earlier on, you say MSVC is not standard-compliant but actually it is (it destroys the parameters immediately on returning) – M.M Oct 15 '16 at 01:01
  • @M.M MSVC case #1 is not standard-compliant in C++11 (destroys object too early). Both temporary and function parameter are the same object per definition of copy-elision. Yes, my example with '&a' will exhibit 'unspecified' behavior in C++17 (unless they change errata). – C.M. Oct 15 '16 at 07:15
  • MSVC case 1 is standard-compliant in C++11. The function parameter is not a temporary. No, your example with `&a` will not exhibit unspecified behaviour. It is implementation-defined behaviour. – M.M Oct 15 '16 at 09:03
  • _the implementation treats the source and target of the omitted copy/move operation as simply two different ways of referring to the **same object**_ -- what is not clear in this quote from C++11 standard? No, case 1 is not standard compliant because object in question (as clearly stipulated by C++11 standard) is supposed to live until end of full expression (that is unless that defect report in C++17 applies to both C++14 and C++11, which I highly doubt). – C.M. Oct 15 '16 at 18:15