C++ Named Return Value Optimization with nested function calls

Question

I know that NRVO allows a function to construct an object and return that object by value without the cost of a copy or even a move operation. I found that it also works with nested function calls, allowing you to construct the object from the return value of another function call.

Please consider the following program and its output as shown in the comments:
(Output from Visual Studio 2017, version 15.2, release build.)

#include <stdio.h>
#include <utility>  // std::move
class W
{
public:
  W() { printf( "W::W()\n" ); }
  W( const W& ) { printf( "W::W( const W& )\n" ); }
  W( W&& ) { printf( "W::W( W&& )\n" ); }
  W& operator=( const W& ) { printf( "W::operator=( const W& )\n" ); return *this; }
  W& operator=( W&& ) { printf( "W::operator=( W&& )\n" ); return *this; }
  ~W() { printf( "W::~W()\n" ); }
  void Transform() { printf( "W::Transform()\n" ); }
  void Run() { printf( "W::Run()\n" ); }
};

W make()
{
  W w;
  return w;
}

W transform_make()
{
  W w{ make() };
  w.Transform();
  return w;
}

W transform1( W w )
{
  w.Transform();
  return w;
}

W&& transform2( W&& w )
{
  w.Transform();
  return std::move(w);
}

int main()                         // Program output:
{
  printf( "TestM:\n" );            //TestM:
  {                                //W::W()
    W w{ make() };                 //W::Run()
    w.Run();                       //W::~W()
  }
                                   //TestTM:
  printf( "TestTM:\n" );           //W::W()
  {                                //W::Transform()
    W w{ transform_make() };       //W::Run()
    w.Run();                       //W::~W()
  }
                                   //TestT1:
  printf( "TestT1:\n" );           //W::W()
  {                                //W::Transform()
    W w{ transform1( make() ) };   //W::W( W&& )
    w.Run();                       //W::~W()
  }                                //W::Run()
                                   //W::~W()

  printf( "TestT2:\n" );           //TestT2:
  {                                //W::W()
    W&& w{ transform2( make() ) }; //W::Transform()
    w.Run();                       //W::~W()
  }                                //W::Run()
}

TestM is the normal NRVO case. The object W is constructed and destructed only once. TestTM is the nested NRVO case. Again the object is constructed only once and never copied or moved. So far so good.

Now getting to my question - how can I make TestT1 work with the same efficiency as TestTM? As you can see in TestT1 a second object is move constructed - this is something I would like to avoid. How can I change the function transform1() to avoid any additional copies or moves? If you think about it, TestT1 is not that much different from TestTM, so I have a feeling that this is something that must be possible.

For my second attempt, TestT2, I tried passing the object via rvalue reference. This eliminated the extra move construction, but unfortunately it causes the destructor to be called before I am done with the object, which is not always ideal.

Update:
I also note that it is possible to make it work using references, as long as you make sure not to use the object beyond the end of the statement:

W&& transform2( W&& w )
{
  w.Transform();
  return std::move(w);
}

void run( W&& w )
{
  w.Run();
}

printf( "TestT3:\n" );           //TestT3:
{                                //W::W()
  run( transform2( make() ) );   //W::Transform()
}                                //W::Run()
                                 //W::~W()

Is this safe to do?

Curious · Accepted Answer · 2017-06-10T17:31:52.897

2

This happens in TestT1 because the compiler is explicitly not allowed to apply NRVO to a by-value function parameter. In TestT1, transform1() accepts a W instance by value as a function parameter, so the compiler cannot elide the move on return.

See "Why are by-value parameters excluded from NRVO?" and my discussion with Howard Hinnant about the issue in the comments on "Why does for_each return function by move".

Because of this, you cannot make TestT1 work as efficiently as TestTM.

The relevant quote from the standard:

15.8.3 Copy/move elision [class.copy.elision]

When certain criteria are met, an implementation is allowed to omit the copy/move construction of a class object, ...

in a return statement in a function with a class return type, when the expression is the name of a non-volatile automatic object (other than a function parameter or a variable introduced by the exception-declaration of a handler (18.3)) with the same type (ignoring cv-qualification) as the function return type, the copy/move operation can be omitted by constructing the automatic object directly into the function call’s return object
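The rule quoted above can be observed with a small counting type. This is only a sketch: `Tracked`, `from_local`, `from_parameter` and the two helper functions are hypothetical names that are not part of the question's code. It counts how often the move constructor runs when a local object is returned (where elision is permitted but optional) and when a by-value parameter is returned (where elision is not permitted, so the parameter is moved).

```cpp
#include <cassert>

// Hypothetical counting type, introduced only for this illustration.
struct Tracked {
  static int copies;
  static int moves;
  Tracked() = default;
  Tracked(const Tracked&) { ++copies; }
  Tracked(Tracked&&) noexcept { ++moves; }
};
int Tracked::copies = 0;
int Tracked::moves = 0;

// Returns a local automatic object: the compiler may elide the move (NRVO),
// but it is not required to.
Tracked from_local() {
  Tracked t;
  return t;
}

// Returns a by-value parameter: elision is not permitted here, so the
// return object is move-constructed from the parameter.
Tracked from_parameter(Tracked t) {
  return t;
}

// Number of move constructions observed when returning a local object.
int moves_for_local_return() {
  Tracked::copies = Tracked::moves = 0;
  Tracked r = from_local();
  (void)r;
  return Tracked::moves;
}

// Number of move constructions observed when returning a parameter.
int moves_for_parameter_return() {
  Tracked::copies = Tracked::moves = 0;
  Tracked r = from_parameter(Tracked{});
  (void)r;
  assert(Tracked::copies == 0);  // the parameter is moved, never copied
  return Tracked::moves;
}
```

With a compiler that applies NRVO, `moves_for_local_return()` would be expected to report no moves, while `moves_for_parameter_return()` reports at least one regardless of optimization settings, which matches the extra `W::W( W&& )` line in the TestT1 output.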

edited Jun 10 '17 at 17:31

answered Jun 10 '17 at 17:18

Curious

Thanks, I think I understand. But why does `TestT2` not work? I thought the reference would have extended the lifetime of the temporary object? – Barnett Jun 10 '17 at 18:23
@Barnett the lifetime of the object bound to a reference is only extended if the object being bound is a complete object (for most purposes, if it's a prvalue) or a complete subobject of a complete object (for example `auto&& val = Something{}.member_variable`). Temporaries that do not have their lifetime extended last until the end of the full expression they appear in. So the `&&` parameter and return value in `TestT2` don't extend the lifetime of the temporary beyond the expression the function call appears in – Curious Jun 10 '17 at 18:36
Also see both the answers here https://stackoverflow.com/questions/42441791/lifetime-extension-prvalues-and-xvalues, they might help in explaining things a bit more – Curious Jun 10 '17 at 18:36
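The lifetime rules described in these comments can be checked with a type that counts how many instances are currently alive. This is a sketch only: `Probe`, `make_probe`, `pass_through`, `alive_during` and the three helper functions are hypothetical stand-ins for `W`, `make`, `transform2` and `run` from the question.

```cpp
#include <cassert>
#include <utility>

// Hypothetical type that tracks how many instances currently exist.
struct Probe {
  static int alive;
  Probe() { ++alive; }
  Probe(const Probe&) { ++alive; }
  Probe(Probe&&) noexcept { ++alive; }
  ~Probe() { --alive; }
};
int Probe::alive = 0;

Probe make_probe() { return Probe{}; }

// Mirrors transform2: takes and returns an rvalue reference.
Probe&& pass_through(Probe&& p) { return std::move(p); }

// Mirrors run: reports how many objects exist while the call is executing.
int alive_during(Probe&&) { return Probe::alive; }

// Binding a reference directly to the prvalue extends the temporary's
// lifetime to that of the reference.
int alive_after_direct_binding() {
  Probe&& r = make_probe();
  (void)r;
  return Probe::alive;  // 1: the temporary is still alive here
}

// Binding to the reference returned by a function does not extend anything:
// the temporary dies at the end of the declaration's full expression.
int alive_after_pass_through() {
  Probe&& r = pass_through(make_probe());
  (void)r;              // r dangles; it must not be read from here on
  return Probe::alive;  // 0: the temporary is already destroyed
}

// Using the object only inside the same full expression is fine (the TestT3 pattern).
int alive_inside_full_expression() {
  return alive_during(pass_through(make_probe()));  // 1 while the call runs
}
```

The second helper corresponds to TestT2, where `W::~W()` is printed before `W::Run()`. The third corresponds to TestT3, where the temporary outlives the call to `run()` because it is destroyed only at the end of the full expression.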
@Curious can the object somehow be passed through a method in such a way that guaranteed RVO is preserved (i.e. prvalue materialization is avoided)? – haelix Jan 07 '19 at 18:09
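One pattern that generalizes the question's `transform_make()` is to pass the factory function instead of the object, so that the object is a local variable of the transforming function and not one of its parameters. This is only a sketch under assumptions: `Counted`, `make_counted`, `transform_from` and `demo_transform_from` are hypothetical names, and the final `return` is still a candidate for NRVO only, so elision there is permitted but not guaranteed.

```cpp
#include <cassert>

// Hypothetical stand-in for W that records copies and whether Transform ran.
struct Counted {
  static int copies;
  bool transformed = false;
  Counted() = default;
  Counted(const Counted& o) : transformed(o.transformed) { ++copies; }
  Counted(Counted&& o) noexcept : transformed(o.transformed) {}
  void Transform() { transformed = true; }
};
int Counted::copies = 0;

Counted make_counted() {
  Counted c;
  return c;
}

// Takes a callable that produces the object. The object is then a local
// variable here, so returning it by name is eligible for NRVO, unlike a
// by-value parameter.
template <class Factory>
Counted transform_from(Factory&& factory) {
  Counted c{ factory() };  // initialized directly from the factory's result
  c.Transform();
  return c;                // local object: elision permitted, otherwise moved
}

// Usage: pass the factory itself, not the object it would create.
bool demo_transform_from() {
  Counted result{ transform_from(make_counted) };
  assert(Counted::copies == 0);  // never copied; at worst moved
  return result.transformed;
}
```

The trade-off is that the caller has to hand over a callable instead of an already constructed object, so this shape only fits when the object is created at the call site anyway.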
