Both implied copies of the vector can be, and often are, eliminated. The named return value optimization (NRVO) can eliminate the copy implied by the return statement (return out;), and the temporary implied by the copy-initialization of oof is allowed to be eliminated as well. With both optimizations in play, the object constructed in vector<foo> out; is the same object as oof.
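One way to observe the "same object" claim directly is to compare addresses. The following is only a minimal sketch, not the code from the question: foo here is a stand-in type, and build, same_object and address_inside are hypothetical names introduced for illustration. It records the address of the local vector inside the function and compares it with the address of the variable initialized from the return value.

```cpp
#include <cassert>
#include <vector>

struct foo { int value; };  // stand-in for the question's element type (assumption)

// Address of the local vector inside build(), recorded for later comparison.
static const void* address_inside = nullptr;

std::vector<foo> build()
{
    std::vector<foo> out;
    out.push_back(foo{1});
    address_inside = &out;  // remember where the local object lives
    return out;             // candidate for the named return value optimization
}

// Returns true if the local 'out' and the caller's 'oof' occupied the same storage.
bool same_object()
{
    std::vector<foo> oof = build();
    assert(address_inside != nullptr);  // build() must have run
    return address_inside == static_cast<const void*>(&oof);
}
```

If both copies are elided, same_object() would be expected to return true. A false result means the compiler chose not to apply NRVO, which the language permits.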
It's easier to test which of these optimizations are being performed with an artificial test case such as this.
struct CopyMe
{
    CopyMe();
    CopyMe(const CopyMe& x);
    CopyMe& operator=(const CopyMe& x);
    char data[1024]; // give it some bulk
};

void Mutate(CopyMe&);

CopyMe fn()
{
    CopyMe x;
    Mutate(x);
    return x;
}

int main()
{
    CopyMe y = fn();
    return 0;
}
The copy constructor is declared but not defined, so calls to it can't be inlined and eliminated. Compiling with the now comparatively old gcc 4.4 at -O3 -fno-inline gives the following assembly (filtered to demangle C++ names and edited to remove non-code):
fn():
    pushq   %rbx
    movq    %rdi, %rbx
    call    CopyMe::CopyMe()
    movq    %rbx, %rdi
    call    Mutate(CopyMe&)
    movq    %rbx, %rax
    popq    %rbx
    ret

main:
    subq    $1032, %rsp
    movq    %rsp, %rdi
    call    fn()
    xorl    %eax, %eax
    addq    $1032, %rsp
    ret
As can be seen, there are no calls to the copy constructor. In fact, gcc performs these optimizations even at -O0. You have to pass -fno-elide-constructors to turn this behaviour off; if you do, gcc generates two calls to the copy constructor of CopyMe: one inside and one outside the call to fn().
fn():
    movq    %rbx, -16(%rsp)
    movq    %rbp, -8(%rsp)
    subq    $1048, %rsp
    movq    %rdi, %rbx
    movq    %rsp, %rdi
    call    CopyMe::CopyMe()
    movq    %rsp, %rdi
    call    Mutate(CopyMe&)
    movq    %rsp, %rsi
    movq    %rbx, %rdi
    call    CopyMe::CopyMe(CopyMe const&)
    movq    %rbx, %rax
    movq    1040(%rsp), %rbp
    movq    1032(%rsp), %rbx
    addq    $1048, %rsp
    ret

main:
    pushq   %rbx
    subq    $2048, %rsp
    movq    %rsp, %rdi
    call    fn()
    leaq    1024(%rsp), %rdi
    movq    %rsp, %rsi
    call    CopyMe::CopyMe(CopyMe const&)
    xorl    %eax, %eax
    addq    $2048, %rsp
    popq    %rbx
    ret
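If reading assembly is inconvenient, a rough alternative is to count copy-constructor calls at run time. This is a sketch under assumptions, not part of the test case above: Tracked, make and count_copies are hypothetical names, and unlike CopyMe the copy constructor is defined here so that the program links.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical counterpart of CopyMe whose copy constructor counts its calls.
struct Tracked
{
    static int copies;
    Tracked() { std::memset(data, 0, sizeof data); }
    Tracked(const Tracked& other)
    {
        std::memcpy(data, other.data, sizeof data);
        ++copies;  // observable side effect of each copy that is not elided
    }
    char data[1024];  // give it some bulk, as in the original test case
};
int Tracked::copies = 0;

Tracked make()
{
    Tracked x;
    x.data[0] = 'a';  // stands in for Mutate(x)
    return x;         // copy out of 'x' may be elided (NRVO)
}

// Returns the number of copy constructions triggered by one 'Tracked y = make();'.
int count_copies()
{
    Tracked::copies = 0;
    Tracked y = make();  // copy from the returned temporary may be elided
    (void)y;
    int n = Tracked::copies;
    assert(n >= 0 && n <= 2);  // the language implies at most two copies here
    return n;
}
```

With both optimizations applied, count_copies() would be expected to return 0; compiling with -fno-elide-constructors should raise the count. Note that from C++17 onward the initialization of y from the returned temporary is no longer a copy that can be switched back on, so in those modes only the copy inside make() can reappear.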