Can the compiler elide the following copy?

Question

I'm still a rookie programmer. I know that premature optimization is bad, but I also know that copying huge objects around is bad as well.

I've read up on copy elision and its synonyms, but the examples on Wikipedia for example make it seem to me that copy elision can only take place if the object to be returned gets returned at the same time it gets completely constructed.

What about objects like vectors, which usually only make sense as a return value when filled with something? After all, an empty vector could just be instantiated manually.

So, does it also work in a case like this?

bad style for brevity:

vector<foo> bar(string baz)
{
    vector<foo> out;
    for (char letter : baz)  // visit each letter of baz (C++11 range-based for)
        out.push_back(someTable[letter]);

    return out;
}

int main()
{
    vector<foo> oof = bar("Hello World");
}

I have no real trouble using bar(vector & out, string text), but the above way would look so much better, aesthetically, and for intent.

It can be elided. Note however, that the standard does _still_ require the copy constructor to be accessible (e.g. non-private) — sehe, May 26 '11 at 13:28

score 10 · Accepted Answer · edited May 26 '11 at 14:11

the examples on wikipedia for example make it seem to me that copy elision can only take place if the object to be returned gets returned at the same time it gets completely constructed.

That is misleading (read: wrong). The real condition is that the same object is returned on all code paths, i.e. that only one object is ever constructed as the potential return value.

Your code is fine; any modern compiler can elide the copy.
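
If you want to check this for the question's function, here is a minimal sketch. The names `Foo`, `someTable`, `bufferInsideBar` and `bufferSurvivesReturn` are made up for illustration and are not from the question. The idea is to record where the vector's heap buffer lives just before `return out;` and compare that with the buffer the caller ends up with. A deep copy would allocate a new buffer, so matching addresses mean the elements were not copied.

```cpp
#include <cstddef>
#include <string>
#include <vector>

struct Foo { int value; };  // hypothetical stand-in for the question's foo

// Hypothetical lookup: maps each character to a Foo.
inline Foo someTable(char letter) { return Foo{static_cast<int>(letter)}; }

// Address of the element buffer as seen inside bar(), kept for the comparison.
const Foo* bufferInsideBar = nullptr;

std::vector<Foo> bar(const std::string& baz)
{
    std::vector<Foo> out;
    out.reserve(baz.size());
    for (char letter : baz)
        out.push_back(someTable(letter));

    bufferInsideBar = out.data();  // where the elements live before returning
    return out;                    // a single named local: candidate for NRVO
}

// True if the caller's vector owns the very buffer that bar() filled.
bool bufferSurvivesReturn()
{
    std::vector<Foo> oof = bar("Hello World");
    return oof.size() == 11 && oof.data() == bufferInsideBar;
}
```

Note that this check cannot tell elision apart from a C++11 move, since a moved-from vector hands over its buffer as well. Either way, the elements are not copied.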

On the other hand, the following code could potentially generate problems:

vector<int> foo() {
    vector<int> a;
    vector<int> b;
    // … fill both.
    bool c;
    std::cin >> c;
    if (c) return a; else return b;
}

Here, the compiler needs to fully construct two distinct objects and only later decides which of them is returned. Hence it has to copy once, because it cannot construct the returned object directly in the target memory location.
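
To see the difference between the two situations, here is a rough sketch with a hypothetical `Tracer` type (not from the question) that counts its own copies and moves. It assumes C++11 or later, where a `return` that names a local object falls back to the move constructor when elision is not performed, so the "copy" described above becomes a cheap move for types like `std::vector`.

```cpp
// Hypothetical tracer type: counts how often it is copied or moved.
struct Tracer {
    static int copies;
    static int moves;
    int id;
    explicit Tracer(int i) : id(i) {}
    Tracer(const Tracer& other) : id(other.id) { ++copies; }
    Tracer(Tracer&& other) noexcept : id(other.id) { ++moves; }
};
int Tracer::copies = 0;
int Tracer::moves = 0;

// One named local on every path: the compiler may construct `a`
// directly in the caller's storage (NRVO).
Tracer oneCandidate()
{
    Tracer a(1);
    return a;
}

// Two named locals, chosen at run time: they cannot both occupy the
// return slot, so at least one path has to move (or copy) its object.
Tracer twoCandidates(bool pickFirst)
{
    Tracer a(1);
    Tracer b(2);
    if (pickFirst) return a; else return b;
}
```

After calling both functions, `Tracer::copies` stays at zero, while `Tracer::moves` shows how many returns the compiler could not elide. The exact move count depends on the compiler and its settings.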

edited May 26 '11 at 14:11

MSalters

answered May 26 '11 at 13:27

Konrad Rudolph

Your counterexample is actually explicitly exempted from copy-elision by the same paragraph I quote in my answer (but another part of it). The elision is only permitted if the expression in the return-statement is the name of a class-object. – Björn Pollex May 26 '11 at 13:33
may the compiler decide to _move_ the returned vector? – user396672 May 26 '11 at 13:39
@user396672 I’m not firm in C++0x but logic dictates that this should indeed be possible (since the original isn’t needed any more after the `return`). – Konrad Rudolph May 26 '11 at 13:50
The example is trivially optimized BTW: `if (!c) { swap(a,b); } return a;` IOW, don't worry up front whether you need to design everything with RVO in mind. – MSalters May 26 '11 at 14:13
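
The swap idea from the last comment can be sketched as follows. The function name `pick`, the parameter and the sample contents are assumptions for illustration; the original reads the flag from `std::cin`. Swapping two vectors only exchanges their internal buffers, so it is cheap, and afterwards there is a single `return` naming a single local, which is the shape NRVO needs.

```cpp
#include <utility>
#include <vector>

// Hypothetical variant of foo(): the flag is a parameter instead of user input.
std::vector<int> pick(bool c)
{
    std::vector<int> a{1, 2, 3};  // stand-in for "fill both"
    std::vector<int> b{4, 5};

    if (!c)
        std::swap(a, b);  // constant time: exchanges the internal buffers

    return a;  // the only return statement, naming one local object
}
```

The swap is done when `b` is the one wanted, so that `a` always holds the contents to return.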

score 5 · Answer 2 · answered May 26 '11 at 13:26

There is nothing preventing the compiler from eliding the copy. The standard permits this in 12.8/15:

[...] This elision of copy operations is permitted in the following circumstances (which may be combined to eliminate multiple copies):

[...]

when a temporary class object that has not been bound to a reference (12.2) would be copied to a class object with the same cv-unqualified type, the copy operation can be omitted by constructing the temporary object directly into the target of the omitted copy

Whether it actually does so depends on the compiler and the settings you use.

answered May 26 '11 at 13:26

Björn Pollex

That looks to me like a general answer about the topic, as in: "If the compiler were smart enough to detect opportunities to elide the copy, it would do so even if the object to be returned were to be edited all over the returning function.", yes? Edit: Ah, that snippet is informative, shame on me if it was in the articles I read...thanks! – Erius May 26 '11 at 13:30
@Erius: It only depends on two things: If the compiler is smart enough, and if it is allowed to do so. I can only answer the second one, as I don't know your compiler and your settings. – Björn Pollex May 26 '11 at 13:31
Well, it's the (hopefully the latest) MSVC one, full optimizations, but I fully understand the part about the compiler intelligence now, so thanks again. – Erius May 26 '11 at 13:41

CB Bailey · Answer 3 · 2011-05-26T22:00:54.400

Both implied copies of the vector can be, and often are, eliminated. The named return value optimization can eliminate the copy implied in the return statement return out; and the temporary implied in the copy initialization of oof is allowed to be eliminated as well.

With both optimizations in play the object constructed in vector<foo> out; is the same object as oof.

It's easier to test which of these optimizations are being performed with an artificial test case such as this.

struct CopyMe
{
    CopyMe();
    CopyMe(const CopyMe& x);
    CopyMe& operator=(const CopyMe& x);

    char data[1024]; // give it some bulk
};

void Mutate(CopyMe&);

CopyMe fn()
{
    CopyMe x;
    Mutate(x);
    return x;
}

int main()
{
    CopyMe y = fn();
    return 0;
}

The copy constructor is declared but not defined so that calls to it can't be inlined and eliminated. Compiling with a now comparatively old gcc 4.4 gives the following assembly at -O3 -fno-inline (filtered to demangle C++ names and edited to remove non-code).

fn():
        pushq   %rbx
        movq    %rdi, %rbx
        call    CopyMe::CopyMe()
        movq    %rbx, %rdi
        call    Mutate(CopyMe&)
        movq    %rbx, %rax
        popq    %rbx
        ret

main:
        subq    $1032, %rsp
        movq    %rsp, %rdi
        call    fn()
        xorl    %eax, %eax
        addq    $1032, %rsp
        ret

As can be seen, there are no calls to the copy constructor. In fact, gcc performs these optimizations even at -O0. You have to pass the -fno-elide-constructors option to turn this behaviour off; if you do this then gcc generates two calls to the copy constructor of CopyMe - one inside and one outside of the call to fn().

fn():
        movq    %rbx, -16(%rsp)
        movq    %rbp, -8(%rsp)
        subq    $1048, %rsp
        movq    %rdi, %rbx
        movq    %rsp, %rdi
        call    CopyMe::CopyMe()
        movq    %rsp, %rdi
        call    Mutate(CopyMe&)
        movq    %rsp, %rsi
        movq    %rbx, %rdi
        call    CopyMe::CopyMe(CopyMe const&)
        movq    %rbx, %rax
        movq    1040(%rsp), %rbp
        movq    1032(%rsp), %rbx
        addq    $1048, %rsp
        ret

main:
        pushq   %rbx
        subq    $2048, %rsp
        movq    %rsp, %rdi
        call    fn()
        leaq    1024(%rsp), %rdi
        movq    %rsp, %rsi
        call    CopyMe::CopyMe(CopyMe const&)
        xorl    %eax, %eax
        addq    $2048, %rsp
        popq    %rbx
        ret
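
If reading assembly is not appealing, roughly the same experiment can be run by counting copy constructor calls at run time. This is only a sketch: `CountedCopy`, `mutate` and `copiesForOneCall` are hypothetical names, and unlike the test case above the copy constructor is defined here so that it can bump a counter.

```cpp
#include <cstring>

// Hypothetical counting variant of CopyMe: same bulk, but the copy
// constructor is defined and records each call.
struct CountedCopy {
    static int copies;
    char data[1024];  // give it some bulk

    CountedCopy() : data() {}
    CountedCopy(const CountedCopy& x)
    {
        std::memcpy(data, x.data, sizeof data);
        ++copies;
    }
};
int CountedCopy::copies = 0;

void mutate(CountedCopy& x) { x.data[0] = 'x'; }

CountedCopy fn()
{
    CountedCopy x;
    mutate(x);
    return x;  // first implied copy (NRVO candidate)
}

// Number of copy constructor calls made by one `CountedCopy y = fn();`.
int copiesForOneCall()
{
    CountedCopy::copies = 0;
    CountedCopy y = fn();  // second implied copy (temporary into y)
    (void)y;
    return CountedCopy::copies;
}
```

With both elisions performed the function should report no copies at all; building with -fno-elide-constructors should make the count go up, though how far depends on the compiler version and the language standard selected.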
