I've asked a few questions that have touched on this issue, but I've been getting differing responses, so I thought it best to ask directly.
Let's say we have the following code:
#include <utility> // std::move
// Silly examples of A and B, don't take them too seriously,
// just keep in mind they're big and not dynamically allocated.
struct A { int x[1000]; A() { for (int i = 0; i != 1000; ++i) { x[i] = i * 2; } } };
struct B { int y[1000]; B() { for (int i = 0; i != 1000; ++i) { y[i] = i * 3; } } };
struct C
{
A a;
B b;
};
A create_a() { return A(); }
B create_b() { return B(); }
C create_c(A&& a, B&& b)
{
C c;
c.a = std::move(a);
c.b = std::move(b);
return c;
}
int main()
{
C x = create_c(create_a(), create_b());
}
Now ideally create_c(A&&, B&&) should be a no-op. Instead of the calling convention being that A and B are created and references to them are passed on the stack, A and B should be created and passed by value directly in the place of the return value, c. With NRVO, this would mean creating and passing them directly into x, with no further work for the function create_c to do.
This would avoid the need to create copies of A and B.
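One way to check how many copies a particular compiler really makes for this pattern is to instrument the types. The sketch below is not the code above: Big, Pair, create_big, create_pair, count_copies and the two global counters are hypothetical stand-ins introduced only for counting. Because the array is stored inline, "moving" such a struct is still an element-wise copy, so the two assignments always cost two copies; whether the return statement adds more depends on whether NRVO is applied.

```cpp
#include <cassert>
#include <utility>

// Global counters (hypothetical instrumentation, not part of the question's code).
int g_default_ctors = 0;  // number of default constructions of Big
int g_copies = 0;         // copy constructions plus copy assignments of Big

// Stand-in for A and B: large and not dynamically allocated.
struct Big {
    int v[1000];
    Big() { ++g_default_ctors; for (int i = 0; i != 1000; ++i) { v[i] = i; } }
    Big(const Big& o) { ++g_copies; for (int i = 0; i != 1000; ++i) { v[i] = o.v[i]; } }
    Big& operator=(const Big& o) {
        ++g_copies;
        for (int i = 0; i != 1000; ++i) { v[i] = o.v[i]; }
        return *this;
    }
};

// Stand-in for C.
struct Pair { Big a; Big b; };

Big create_big() { return Big(); }

Pair create_pair(Big&& a, Big&& b) {
    Pair p;              // default-constructs p.a and p.b
    p.a = std::move(a);  // Big declares no move assignment, so this copies
    p.b = std::move(b);
    return p;            // NRVO is permitted here, but not guaranteed
}

// Runs the pattern once and returns how many Big copies were observed.
int count_copies() {
    g_default_ctors = 0;
    g_copies = 0;
    Pair x = create_pair(create_big(), create_big());
    (void)x;  // silence unused-variable warnings
    return g_copies;
}
```

Calling count_copies() should return at least 2 (the two assignments). Assuming C++17, a result of exactly 2 would indicate that the return of p was elided, while a larger number would indicate that the members were copied again on return.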
Is there any way to allow/encourage/force this behavior from a compiler, or do optimizing compilers generally do this anyway? And will this only work when the compiler inlines the functions, or will it also work across compilation units?
(How I think this could work across compilation units...)
If create_a() and create_b() took a hidden parameter saying where to place the return value, they could place their results directly into x, which is then passed by reference to create_c(), which needs to do nothing and immediately returns.
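A minimal sketch of what "placing the results into x directly" could look like, under stated assumptions: it assumes C++17, where initializing an object from a function call's return value does not create an intermediate copy, and it replaces create_c with a hypothetical create_c_direct that takes no reference parameters. The globals g_last_a, g_last_b and g_copies, the copy constructors, and the helper built_in_place are instrumentation added here to observe where construction happens; none of them is part of the original code.

```cpp
#include <cassert>

// Hypothetical instrumentation: where the most recent A and B were
// constructed, and how many copies of A or B were made.
const void* g_last_a = nullptr;
const void* g_last_b = nullptr;
int g_copies = 0;

struct A {
    int x[1000];
    A() { g_last_a = this; for (int i = 0; i != 1000; ++i) { x[i] = i * 2; } }
    // Copy constructor added only to count copies.
    A(const A& o) { ++g_copies; for (int i = 0; i != 1000; ++i) { x[i] = o.x[i]; } }
};

struct B {
    int y[1000];
    B() { g_last_b = this; for (int i = 0; i != 1000; ++i) { y[i] = i * 3; } }
    // Copy constructor added only to count copies.
    B(const B& o) { ++g_copies; for (int i = 0; i != 1000; ++i) { y[i] = o.y[i]; } }
};

struct C {
    A a;
    B b;
};

A create_a() { return A(); }
B create_b() { return B(); }

// Hypothetical variant of create_c: each member is initialized straight
// from a function call, and the result is returned without a named local.
C create_c_direct() { return C{create_a(), create_b()}; }

// Returns true if A and B were constructed at their final addresses inside x.
bool built_in_place() {
    g_copies = 0;
    C x = create_c_direct();
    return g_last_a == &x.a && g_last_b == &x.b;
}
```

If built_in_place() returns true and g_copies stays at 0, the constructors of A and B ran at the addresses of x.a and x.b, which is the effect the hidden-parameter idea describes. Note that this only holds for the variant without A&& and B&& parameters; once the arguments are bound to references, the temporaries exist somewhere else and have to be copied into c.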