C++ vector, return vs. parameter

Question

Possible Duplicate:
how to “return an object” in C++

I am wondering if there is a difference between the three following approaches:

void FillVector_1(vector<int>& v) {
    v.push_back(1); // lots of push_backs!
}

vector<int> FillVector_2() {
    vector<int> v;
    v.push_back(1); // lots of push_backs!
    return v;
}

vector<int> FillVector_3() {
    int tab[SZ] = { 1, 2, 3, /*...*/ };
    return vector<int>(tab, tab + SZ);
}

Just return by value and worry about performance when you feel it's actually a problem. Then profile it so you don't have to guess. I'm preemptively closing as [duplicate](http://stackoverflow.com/questions/3350385/how-to-return-an-object-in-c). — GManNickG, Sep 13 '10 at 18:34
I would not have voted to close this as a duplicate for a couple reasons: (1) The earlier question had specifics about dynamic memory allocation that don't apply to this question, and (2) this question is STL-specific which can invoke other idioms like insert iterators. Also, some implementations of STL already take advantage of C++0x r-value references to minimize copying. That method doesn't apply in the earlier question. — Adrian McCarthy, Sep 13 '10 at 19:16
@GMan: While I agree with the sentiment to not prematurely micro-optimize, I don't agree with the sentiment that one should return large objects by-value as a matter of course. Especially when the STL collections are concerned, there are better ways. For instance, by writing the function to accept an output iterator we can make it more adaptable and more STL-ish, while avoiding having to rely on RVO or funky compiler optimizations that might change the behavior of your program in an undesirable way. Besides, C++ programmers should know when their objects are being copied - not code & pray — John Dibling, Sep 13 '10 at 19:27
@John: It's not code and pray, it's code and get-your-program-working-then-profile-it-to-see-what-can-be-sped-up-and-stop-freaking-guessing. @Adrian @John: The iterator design is indeed probably the best way to code this, but as it stands the question itself is clearly just an unwarranted case of "oh surely I cannot return by value!". But as I like to look beyond the face of the question and solve the "real" problem, you're right; I'm voting to re-open. — GManNickG, Sep 13 '10 at 19:40

score 19 · Accepted Answer · answered Sep 13 '10 at 18:40

The biggest difference is that the first way appends to existing contents, whereas the other two fill an empty vector. :)

I think the keyword you are looking for is return value optimization, which should be rather common (with G++ you'll have to turn it off specifically to prevent it from being applied). That is, if the usage is like:

vector<int> vec = fill_vector();

then there might quite easily be no copies made (and the function is just easier to use).

If you are working with an existing vector

vector<int> vec;
while (something)
{
    vec = fill_vector();
    //do things
}

then using an out parameter would avoid creation of vectors in a loop and copying data around.

+1 for pointing out the difference between initialization and assignment w.r.t. copy elision. — sellibitze, Sep 13 '10 at 18:45

score 9 · Answer 2 · answered Sep 13 '10 at 19:04

9

The idiomatic C++ approach would be to abstract over the container type by using an output iterator:

template<typename OutputIterator>
void FillContainer(OutputIterator it) {
    *it++ = 1;
    ...
}

Then it can be used with vector as:

std::vector<int> v;
FillContainer(std::back_inserter(v));

The performance (and other advantages, such as being able to fill a non-empty container) are the same as for your option #1. Another good thing is that this can be used to output in a streaming mode, where results are immediately processed and discarded without being stored, if the appropriate kind of iterator is used (e.g. ostream_iterator).

answered Sep 13 '10 at 19:04

Pavel Minaev

99,783
25
219
289

+1. This seems like the most "STL way." Note that it might be useful to return the iterator from `FillContainer`. – Adrian McCarthy Sep 13 '10 at 19:19
How does this work when "FillContainer" is a method of a class? – User May 24 '11 at 00:07
@User: same way - class member functions can also be templated. What you can't do with such a thing is make it virtual. – Pavel Minaev May 24 '11 at 16:57

score 6 · Answer 3 · answered Sep 13 '10 at 18:35

6

One would assume parameter is best, but in practice it often is not. This depends on compiler. Some compilers (I think actually most recent compilers) will apply Return Value Optimization - Visual Studio 2005 and later should do it in both cases you have provided (see Named Return Value Optimization in Visual C++ 2005).

The best way to know for sure is to check the disassembly produced.

answered Sep 13 '10 at 18:35

Suma

33,181
16
123
191

This is a _huge_ optimization, and most modern compilers leverage it with big gains in performance. – fbrereto Sep 13 '10 at 18:38
The whole idea of RVO makes me cringe. I hate it. – John Dibling Sep 13 '10 at 19:28

sellibitze · Answer 4 · 2010-09-13T18:44:12.077

Adding a fourth variant into the mix:

void FillVector_4(vector<int>& v) {
   static const int tab[SZ] = {1,2,3, ... };
   v.assign(tab,tab+SZ);
}

If you're thinking about performance version 2 and version 3 may make the compiler create a vector<int> copy for the return value. That is, if the compiler isn't able to do NRVO (named return value optimization). Also, consecutive push_backs without a reserve probably leads to a couple of reallocations since the vector needs to grow. Whether this matters at all depends on your problem you're trying to solve.

You'll be pleased to know that C++0x will make returning a locally created vector very efficient. I also recommend reading David Abrahams article series about efficient value types including passing/returning.

Andy Thomas · Answer 5 · 2010-09-13T18:46:51.017

2

The first definitely does not copy the vector.

The cost of copying the vector could be linear in the number of elements in the vector.

The first introduces no risk of linear behavior on any platform or compiler, and no costs of profiling and refactoring.

edited Sep 13 '10 at 18:46

answered Sep 13 '10 at 18:35

Andy Thomas

84,978
11
107
151

The first also allows to call several functions in a row to fill a single vector. If you need to do the same with #2, you'll have to copy elements from the vector being returned (which means that all the heroic RVO efforts by the compiler will be for nothing, since you'll end up with an immediately-discarded copy anyway). – Pavel Minaev Sep 13 '10 at 19:01

score 0 · Answer 6 · answered Sep 13 '10 at 18:36

0

At a glance, the first two probably have more resizing of the vector going on, whereas the 3rd one probably does not need to resize the vector as it runs. This can be mitigated by resizing it yourself before the pushbacks.

answered Sep 13 '10 at 18:36

Brian

25,523
18
82
173

C++ vector, return vs. parameter

6 Answers6

Linked