58

I have a very basic question in C++. How to avoid copy when returning an object ?

Here is an example :

std::vector<unsigned int> test(const unsigned int n)
{
    std::vector<unsigned int> x;
    for (unsigned int i = 0; i < n; ++i) {
        x.push_back(i);
    }
    return x;
}

As I understand how C++ works, this function will create 2 vectors : the local one (x), and the copy of x which will be returned. Is there a way to avoid the copy ? (and I don't want to return a pointer to an object, but the object itself)


What would be the syntax of that function using "move semantics" (which was stated in the comments)?

Ciro Santilli OurBigBook.com
  • 347,512
  • 102
  • 1,199
  • 985
Vincent
  • 57,703
  • 61
  • 205
  • 388
  • move semantics: http://www2.research.att.com/~bs/C++0xFAQ.html#rval – chris May 07 '12 at 04:34
  • It won't necessarily create a copy. NRVO or move symantics can prevent that. – Vaughn Cato May 07 '12 at 04:35
  • 1
    You can rely on your compiler for performing NRVO magic or explictly use move semantics. – Alok Save May 07 '12 at 04:36
  • The "copy of x which will be returned" may be constructed by moving from x, or its construction elided to become the same object as x. The semantics of the language already avoid any copies. – Mankarse May 07 '12 at 04:42
  • 1
    In response to your edit - you do not need to change the syntax at all. Anything which is eligible for copy elision must use move construction (if the construction is not elided altogether). – Mankarse May 07 '12 at 04:43
  • NRVO or not, what if the return type cannot be copied (i.e. it causes a compile error)? – Cillié Malan Nov 26 '15 at 21:16

7 Answers7

64

There seems to be some confusion as to how the RVO (Return Value Optimization) works.

A simple example:

#include <iostream>

struct A {
    int a;
    int b;
    int c;
    int d;
};

A create(int i) {
    A a = {i, i+1, i+2, i+3 };
    std::cout << &a << "\n";
    return a;
}

int main(int argc, char*[]) {
    A a = create(argc);
    std::cout << &a << "\n";
}

And its output at ideone:

0xbf928684
0xbf928684

Surprising ?

Actually, that is the effect of RVO: the object to be returned is constructed directly in place in the caller.

How ?

Traditionally, the caller (main here) will reserve some space on the stack for the return value: the return slot; the callee (create here) is passed (somehow) the address of the return slot to copy its return value into. The callee then allocate its own space for the local variable in which it builds the result, like for any other local variable, and then copies it into the return slot upon the return statement.

RVO is triggered when the compiler deduces from the code that the variable can be constructed directly into the return slot with equivalent semantics (the as-if rule).

Note that this is such a common optimization that it is explicitly white-listed by the Standard and the compiler does not have to worry about possible side-effects of the copy (or move) constructor.

When ?

The compiler is most likely to use simple rules, such as:

// 1. works
A unnamed() { return {1, 2, 3, 4}; }

// 2. works
A unique_named() {
    A a = {1, 2, 3, 4};
    return a;
}

// 3. works
A mixed_unnamed_named(bool b) {
    if (b) { return {1, 2, 3, 4}; }

    A a = {1, 2, 3, 4};
    return a;
}

// 4. does not work
A mixed_named_unnamed(bool b) {
    A a = {1, 2, 3, 4};

    if (b) { return {4, 3, 2, 1}; }

    return a;
}

In the latter case (4), the optimization cannot be applied when A is returned because the compiler cannot build a in the return slot, as it may need it for something else (depending on the boolean condition b).

A simple rule of thumb is thus that:

RVO should be applied if no other candidate for the return slot has been declared prior to the return statement.

Matthieu M.
  • 287,565
  • 48
  • 449
  • 722
  • +1 for pointing out that we can avoid not only copying, but construction of redundant object and redundant assignment as well ;) – LihO May 07 '12 at 09:58
  • Re case (4), it depends on the compiler (how smart it is) and the specifics of the code. For example, with the concrete code you show a smart compiler can note that the initialization of `a` has no side effects, and that that declaration can be moved down under the `if`. – Cheers and hth. - Alf May 09 '12 at 21:05
  • @Cheersandhth.-Alf: Exact, the as-if rule still stands obviously. In the general case though (out-of-line constructor), this would only be deductible with LTO enabled. – Matthieu M. May 10 '12 at 06:11
  • @MatthieuM. Does it still works if an exception is thrown ? – naab Feb 24 '16 at 20:55
  • @naab: I am not quite sure I see what the problem could be? All local variables prior to the exception being thrown must be destructed, no matter where they are located (in the function frame or in the caller's function frame). So I don't see why throwing would have an impact. – Matthieu M. Feb 25 '16 at 07:17
  • @MatthieuM. Thanks for these nice examples. When I ran your first example, I am getting different addresses. I am using g++ v5.4 on Ubuntu. Any idea why? – Abid Rahman K Apr 02 '17 at 11:26
  • 1
    @AbidRahmanK: Are you compiling with optimizations on (`-O2` or `-O3`)? As noted, this is an optimization, so in debug builds (`-O0`) it is unlikely to appear; also, as all optimizations, the compiler might sometimes NOT apply it, for reasons that can be quite difficult to divine. – Matthieu M. Apr 02 '17 at 13:07
  • I am just using `g++ -std=c++11 filename.cpp`. I think it is -O0 by default, right? After your comment, I tried with -O2 & -O3, but they are still different. In another post, I read that we have to specify the copy constructor manually for this to work. – Abid Rahman K Apr 02 '17 at 14:16
  • @AbidRahmanK: No, whether the copy constructor is specified or not should have no incidence. After all, in the link I give it is not specified and yet the optimization applied. I suggest you open a question with your example, compiler version and command line and ask why the optimization does not apply (you may also reference this question/answer); there may be a difference, or maybe changes in gcc. – Matthieu M. Apr 02 '17 at 15:48
  • @MatthieuM. : Sure, thanks. I will give the link once I post. – Abid Rahman K Apr 03 '17 at 14:06
  • "... constructed directly into the return slot with equivalent semantics (the as-if rule)" afaik this actually is an exception from the as-if rule, because it is allowed to skip the copy, even if the copying would have side-effects – 463035818_is_not_an_ai Jun 12 '17 at 11:43
  • Can confirm what @MatthieuM. said, that running without optimizations *can* indeed result in different memory addresses still. I was running my program in debug from within Visual Studio and kept seeing different memory addresses inside and outside my `create` function. When I ran in release (so with optimizations), I began seeing the same memory address. – Alexander Terp Jul 13 '22 at 21:45
  • Then, how about `operator=()`? `A a; a=create(10);` – Zhang Jul 02 '23 at 09:48
  • 1
    @Zhang: RVO is strictly about the callee, and regardless of the caller. Supposing that `create(10)` uses RVO, then you have RVO, but you'll still have two objects in the snippet of code you show: `a` (to be assigned to) and a temporary (which `create(10)` can RVO into). – Matthieu M. Jul 02 '23 at 10:26
  • @MatthieuM. I just tried, move constructor would be called (if therer is a move constructor for the object), not totally free. I wonder if I should stick to the old pass-reference practice. Maybe it's overkilled. – Zhang Jul 02 '23 at 10:49
  • 1
    @Zhang: If you pass an out-reference, won't you have to call the move-constructor inside `create` _anyway_? Avoid creating a default object in the first place, that's the crux of the issue here, and if you can't... out-reference or RVO will have the same cost anyway. – Matthieu M. Jul 02 '23 at 11:15
27

This program can take advantage of named return value optimization (NRVO). See here: http://en.wikipedia.org/wiki/Copy_elision

In C++11 there are move constructors and assignment which are also cheap. You can read a tutorial here: http://thbecker.net/articles/rvalue_references/section_01.html

Pubby
  • 51,882
  • 13
  • 139
  • 180
  • 3
    Should be noted that not all compilers will do this and even those that do won't all the time. It may still be worth looking into IFF the object is big, you're noting copies, and profiling shows it to be a significant bottleneck. – Edward Strange May 07 '12 at 07:12
15

Named Return Value Optimization will do the job for you since the compiler tries to eliminate redundant Copy constructor and Destructor calls while using it.

std::vector<unsigned int> test(const unsigned int n){
    std::vector<unsigned int> x;
    return x;
}
...
std::vector<unsigned int> y;
y = test(10);

with return value optimization:

  1. y is created
  2. x is created
  3. x is assigned into y
  4. x is destructed

(in case you want to try it yourself for deeper understanding, look at this example of mine)

or even better, just like Matthieu M. pointed out, if you call test within the same line where y is declared, you can also avoid construction of redundant object and redundant assignment as well (x will be constructed within memory where y will be stored):

std::vector<unsigned int> y = test(10);

check his answer for better understanding of that situation (you will also find out that this kind of optimization can not always be applied).

OR you could modify your code to pass the reference of vector to your function, which would be semantically more correct while avoiding copying:

void test(std::vector<unsigned int>& x){
    // use x.size() instead of n
    // do something with x...
}
...
std::vector<unsigned int> y;
test(y);
Community
  • 1
  • 1
LihO
  • 41,190
  • 11
  • 99
  • 167
  • Ah I see. Indeed I had ignored the fact that you first constructed a default `y` before assigning to it. Your sequence of events is right in this case, though I would recommend you just directly initialize `y` to avoid creating two objects where one would suffice. Sorry for the noise. – Matthieu M. May 07 '12 at 09:48
  • 1
    @MatthieuM.: I appreciate your point though. Check my answer now :) – LihO May 07 '12 at 09:56
2

Compilers often can optimize away the extra copy for you (this is known as return value optimization). See https://isocpp.org/wiki/faq/ctors#return-by-value-optimization

jamesdlin
  • 81,374
  • 13
  • 159
  • 204
2

The move constructor is guaranteed to be used if NRVO does not happen

Therefore, if you return an object with move constructor (such as std::vector) by value, it is guaranteed not to do a full vector copy, even if the compiler fails to do the optional NRVO optimization.

This is mentioned by two users who appear influential in the C++ specification itself:

Not satisfied by my Appeal to Celebrity?

OK. I can't fully understand the C++ standard, but I can understand the examples it has! ;-)

Quoting the C++17 n4659 standard draft 15.8.3 [class.copy.elision] "Copy/move elision"

3 In the following copy-initialization contexts, a move operation might be used instead of a copy operation:

  • (3.1) — If the expression in a return statement (9.6.3) is a (possibly parenthesized) id-expression that names an object with automatic storage duration declared in the body or parameter-declaration-clause of the innermost enclosing function or lambda-expression, or
  • (3.2) — if the operand of a throw-expression (8.17) is the name of a non-volatile automatic object (other than a function or catch-clause parameter) whose scope does not extend beyond the end of the innermost enclosing try-block (if there is one),

overload resolution to select the constructor for the copy is first performed as if the object were designated by an rvalue. If the first overload resolution fails or was not performed, or if the type of the first parameter of the selected constructor is not an rvalue reference to the object’s type (possibly cv-qualified), overload resolution is performed again, considering the object as an lvalue. [ Note: This two-stage overload resolution must be performed regardless of whether copy elision will occur. It determines the constructor to be called if elision is not performed, and the selected constructor must be accessible even if the call is elided. — end note ]

4 [ Example:

class Thing {
public:
  Thing();
  ~ Thing();
  Thing(Thing&&);
private:
  Thing(const Thing&);
};

Thing f(bool b) {
  Thing t;
  if (b)
    throw t;          // OK: Thing(Thing&&) used (or elided) to throw t
  return t;           // OK: Thing(Thing&&) used (or elided) to return t
}

Thing t2 = f(false);  // OK: no extra copy/move performed, t2 constructed by call to f

struct Weird {
  Weird();
  Weird(Weird&);
};

Weird g() {
  Weird w;
  return w;           // OK: first overload resolution fails, second overload resolution selects Weird(Weird&)
}

— end example

I don't like the "might be used" wording, but I think the intent is to mean that if either "3.1" or "3.2" hold, then the rvalue return must happen.

This is pretty clear on the code comments for me.

Pass by reference + std::vector.resize(0) for multiple calls

If you are making multiple calls to test, I believe that this would be slightly more efficient as it saves a few malloc() calls + relocation copies when the vector doubles in size:

void test(const unsigned int n, std::vector<int>& x) {
    x.resize(0);
    x.reserve(n);
    for (unsigned int i = 0; i < n; ++i) {
        x.push_back(i);
    }
}

std::vector<int> x;
test(10, x);
test(20, x);
test(10, x);

given that https://en.cppreference.com/w/cpp/container/vector/resize says:

Vector capacity is never reduced when resizing to smaller size because that would invalidate all iterators, rather than only the ones that would be invalidated by the equivalent sequence of pop_back() calls.

and I don't think compilers are able to optimize the return by value version to prevent the extra mallocs.

On the other hand, this:

  • makes the interface uglier
  • uses more memory than needed when you decrease the vector size

so there is a trade-off.

Ciro Santilli OurBigBook.com
  • 347,512
  • 102
  • 1,199
  • 985
1

Referencing it would work.

Void(vector<> &x) {

}
arrowd
  • 33,231
  • 8
  • 79
  • 110
Jake Runzer
  • 957
  • 3
  • 17
  • 35
-6

First of all, you could declare your return type to be std::vector & in which case a reference will be returned instead of a copy.

You could also define a pointer, build a pointer inside your method body and then return that pointer (or a copy of that pointer to be correct).

Finally, many C++ compilers may do return value optimization (http://en.wikipedia.org/wiki/Return_value_optimization) eliminating the temporary object in some cases.