1

I need to get this straight. Given the code below:

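#include <cstdlib>
#include <sstream>
#include <string>
#include <vector>
using namespace std;

// Splits a comma-separated line and converts each field to an unsigned long long.
vector<unsigned long long int> getAllNumbersInString(string line){
    vector<unsigned long long int> v;
    string word;
    stringstream stream(line);
    unsigned long long int num;

    while(getline(stream, word, ',')){
        // strtoull covers the full unsigned long long range; atol only returns a long.
        num = strtoull(word.c_str(), NULL, 10);
        v.push_back(num);
    }

    return v;
}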

This sample code simply turns an input string into a series of unsigned long long int values stored in a vector.

Suppose another function calls this one and the vector holds about 100,000 elements. When the function returns, is a new vector created with copies of every element, and is the original vector inside the function then destroyed? Is my understanding correct so far?

Normally, I write my code so that every function that produces a container returns a pointer to it. However, program design-wise, and with my understanding above, should we always return a pointer when it comes to a container?

Karl
  • 2
    *"should we always return a pointer when it comes to container?"* - Definitely not. By all means don't introduce dynamic allocation just because of efficiency fears, and even less so in C++11. Either you need dynamic lifetime or not. Even in C++03 a reference argument is leagues better than returning a dynamically allocated object, if you don't need dynamic allocation otherwise, and in C++11 the whole discussion is moot anyway. – Christian Rau Feb 09 '13 at 15:45

4 Answers

8

The std::vector will most likely (if your compiler optimizations are turned on) be constructed directly in the function's return value. This is known as copy/move elision and is an optimization the compiler is allowed to make:

in a return statement in a function with a class return type, when the expression is the name of a non-volatile automatic object (other than a function or catch-clause parameter) with the same cv-unqualified type as the function return type, the copy/move operation can be omitted by constructing the automatic object directly into the function’s return value

This quote is taken from the C++11 standard but is similar for C++03. It is important to note that copy/move elision does not have to occur at all - it is entirely up to the compiler. Most modern compilers will handle your example with no problems at all.
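One way to observe this on a given compiler is to count constructor calls. The sketch below is only an illustration: `Tracked`, `makeTracked`, and `copiesAfterOneCall` are hypothetical names that do not appear in the question, and it assumes a C++11 (or later) compiler. How many moves occur depends on the compiler and its flags, but no copy should be made.

```cpp
#include <cassert>

// Hypothetical helper type that counts how often it is copied or moved.
struct Tracked {
    static int copies;
    static int moves;
    Tracked() {}
    Tracked(const Tracked&) { ++copies; }
    Tracked(Tracked&&) { ++moves; }
};
int Tracked::copies = 0;
int Tracked::moves = 0;

// Returns a named local by value, the pattern the quoted rule covers.
Tracked makeTracked() {
    Tracked t;
    return t;  // NRVO candidate: t may be built directly in the return value
}

// Resets the counters, calls makeTracked once, and returns the number of copies.
int copiesAfterOneCall() {
    Tracked::copies = 0;
    Tracked::moves = 0;
    Tracked result = makeTracked();
    (void)result;
    // A C++11 compiler either elides or moves here; it never copies.
    assert(Tracked::copies == 0);
    return Tracked::copies;
}
```

Calling `copiesAfterOneCall()` and then inspecting `Tracked::moves` shows whether the compiler elided the operation (0 moves) or fell back to moving.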

If elision does not occur, C++11 will still provide you with a further benefit over C++03:

  • In C++03, without copy elision, returning a std::vector like this would have involved, as you say, copying all of the elements over to the returned object and then destroying the local std::vector.

  • In C++11, the std::vector will be moved out of the function. Moving allows the returned std::vector to steal the contents of the std::vector that is about to be destroyed. This is much more efficient than copying the contents over.

    You may have expected that the object would just be copied because it is an lvalue, but there is a special rule that makes copies like this first be considered as moves:

    When the criteria for elision of a copy operation are met [...] and the object to be copied is designated by an lvalue, overload resolution to select the constructor for the copy is first performed as if the object were designated by an rvalue.
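To make "stealing the contents" concrete, here is a minimal sketch, assuming C++11 and a mainstream standard library. The function name `moveReusesBuffer` is made up for illustration. It checks that the moved-to vector ends up owning the very same heap buffer, so none of the 100,000 elements is copied.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Returns true if move-constructing a vector reuses the source's buffer.
bool moveReusesBuffer() {
    std::vector<unsigned long long> source(100000, 42ULL);
    const unsigned long long* before = source.data();

    // Move construction transfers ownership of the heap buffer; no element is copied.
    std::vector<unsigned long long> target(std::move(source));

    assert(target.size() == 100000);
    return target.data() == before;
}
```

The explicit `std::move` is needed here only because `source` is not being returned; in a `return v;` statement the rule quoted above applies the move automatically.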

As for whether you should return a pointer to your container: the answer is almost certainly no. You shouldn't be passing around pointers unless it's completely necessary, and when it is necessary, you're much better off using smart pointers. As we've seen, in your case it's not necessary at all because there's little to no overhead in returning it by value.

Joseph Mansfield
  • Copy elision would trump the move in C++11. – juanchopanza Feb 09 '13 at 15:38
  • @juanchopanza I said that! :P – Joseph Mansfield Feb 09 '13 at 15:38
  • +1 - I'm shocked so many answers fail to mention move semantics and concentrate on the valid but implementation-defined argument of RVO alone. – Christian Rau Feb 09 '13 at 15:48
  • @ChristianRau: Because in this case even in C++11 copy elision will remove the need for copy or the move. – Alok Save Feb 09 '13 at 15:51
  • @ChristianRau one could argue that move semantics just confuse the issue. Plus, moving requires types that are movable (granted, `std::vector` is one). – juanchopanza Feb 09 '13 at 15:53
  • @AlokSave Yes, I know. But having optimized code **guaranteed by-standard** is still much better than relying on a **probable but not-guaranteed** optimization. And not mentioning move at all in this case is a sin, being a perfect fit for this question ;) – Christian Rau Feb 09 '13 at 15:53
  • Yes, I've tried to make it clear that *even when elision does not occur*, moving is a further optimization permitted by C++11. – Joseph Mansfield Feb 09 '13 at 15:54
  • @ChristianRau: Well, the question is not tagged with C++11 to begin with. Even though C++11 is the current standard, it is not widely used in many of the existing code bases. – Alok Save Feb 09 '13 at 15:54
  • @AlokSave Don't we take [tag:C++] to mean the latest C++ standard? – Joseph Mansfield Feb 09 '13 at 15:55
  • @AlokSave Yes, but in the end *C++11* is, well, *C++* too. Of course just mentioning move semantics without disambiguating between *C++11* and *C++98* wouldn't be a good idea either, but ignoring the current standard (and move semantics are indeed supported by all major compilers) is not an option. – Christian Rau Feb 09 '13 at 15:55
  • @ChristianRau I agree that move semantics are definitely worth mentioning here. I have added a note to my answer. – juanchopanza Feb 09 '13 at 15:57
  • @ChristianRau: As I understand it, the C++ tag on SO means C++03, since there is a specific C++11 tag. If not, the C++11 tag should be done away with already. – Alok Save Feb 09 '13 at 15:57
  • @AlokSave That in turn is a misunderstanding. `c++` doesn't necessarily mean the latest standard, but it doesn't just mean some old standard either. It means the entirety of the language. And in the end, if the OP is not aware of *C++11* and its features at all (which seems to be the case here), then we cannot require him to tag it `c++11`, and mentioning that there is a difference is even more important. – Christian Rau Feb 09 '13 at 15:58
  • @AlokSave: No, it doesn't mean that, and shouldn't, since most beginners probably don't know that there was an update to the standard in 2011, and thus wouldn't know to tag their question with it, even though their compiler supports it. – Benjamin Lindley Feb 09 '13 at 16:00
  • I suppose the rule should really be that [tag:c++] refers to the language as a whole, which at the moment is in a transition phase. If somebody asks a [tag:c++], if the answers are vastly different between C++03 and C++11, both should be accounted for. If somebody wants an answer for a specific standard, they should tag correspondingly. – Joseph Mansfield Feb 09 '13 at 16:01
  • @BenjaminLindley: That is open to interpretation. That is your perception of the situation, which may or may not be true. The point is: why does the C++11 tag still exist? It quashes the whole idea that C++11 is C++, which is true in the sense that C++11 is the current standard but doesn't hold when you have separate tags in a programming forum. – Alok Save Feb 09 '13 at 16:09
  • @AlokSave `c++11` is there for when you need to ask a question specific to *C++11*, instead of a question regarding *C++* in general. I agree that C++11 alone *is not* C++ (at the moment at least), but neither is *C++03*. I don't know how this could be open to interpretation: `c++11` is about C++11 and `c++` is about C++; in the end the question is not tagged `c++03`. Who said tags need to be mutually exclusive? – Christian Rau Feb 09 '13 at 16:15
  • @ChristianRau: Who said they are not? Hopefully, this will remove the ambiguity: [Need for rearrangement of C++ tags in SO main](http://meta.stackexchange.com/questions/166933/need-for-rearrangement-of-c-tags-in-so-main) – Alok Save Feb 09 '13 at 16:25
  • @AlokSave Thanks for posting this meta question. Fortunately the current answers agree with me. – Christian Rau Feb 09 '13 at 17:05
4

It is safe, and I would say preferable, to return by value with any reasonable compiler. The C++ standard allows copy elision, in this case named return value optimization (NRVO), which means this extra copy you are worried about doesn't take place.

Note that this is a case of an optimization that is allowed to modify the observable behaviour of a program.

Note 2. As has been mentioned in other answers, C++11 introduces move semantics, which means that, in cases where RVO doesn't apply, you may still have a very cheap operation where the contents of the object being returned are transferred to the caller. In the case of std::vector, this is extremely cheap. But bear in mind that not all types can be moved.
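As a rough illustration of that last caveat, consider a type written in pre-C++11 style. `Legacy` and `copiesWhenMoved` below are hypothetical names, not taken from the question. Because the class declares a copy constructor and no move constructor, a requested move silently falls back to a copy.

```cpp
#include <cassert>
#include <utility>

// Hypothetical pre-C++11 style type: declares a copy constructor, no move constructor.
struct Legacy {
    static int copies;
    Legacy() {}
    Legacy(const Legacy&) { ++copies; }
};
int Legacy::copies = 0;

// Returns how many copies a std::move-based construction triggers.
int copiesWhenMoved() {
    Legacy::copies = 0;
    Legacy a;
    Legacy b(std::move(a));  // no move constructor exists, so the copy constructor runs
    (void)b;
    assert(Legacy::copies == 1);
    return Legacy::copies;
}
```

For such a type, returning by value is only cheap when the compiler actually elides the copy.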

juanchopanza
2

Your understanding is correct.
But compilers can apply copy elision through RVO and NRVO and remove the extra copy being generated.

Should we always return a pointer when it comes to container?

If you can, of course you should avoid returning by value, especially for non-POD types.

Alok Save
2

That depends on whether or not you need reference semantics.

In general, if you do not need reference semantics, I would say you should not use a pointer, because in C++11 container classes support move semantics, so returning a collection by value is fast. Also, the compiler can elide the call to the move constructor (this is called Named Return Value Optimization or NRVO), so that no overhead at all will be introduced.

However, if you do need to create separate, consistent views of your collection (i.e. aliases), so that for instance insertions into the returned vector will be "seen" in several places that share the ownership of that vector, then you should consider returning a smart pointer.
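A minimal sketch of that shared-ownership case, assuming C++11 and `std::shared_ptr`. The names `makeSharedNumbers` and `sizeSeenByOtherOwner` are invented for illustration. Both owners refer to the same vector, so an insertion made through one is visible through the other.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical factory returning a vector with shared ownership.
std::shared_ptr<std::vector<int>> makeSharedNumbers() {
    return std::make_shared<std::vector<int>>();
}

// Two owners alias the same vector; returns the size seen through the second owner.
std::size_t sizeSeenByOtherOwner() {
    std::shared_ptr<std::vector<int>> first = makeSharedNumbers();
    std::shared_ptr<std::vector<int>> second = first;  // shares ownership, no element copy
    first->push_back(7);
    assert(first.use_count() == 2);
    return second->size();
}
```

If no such aliasing is needed, returning the plain `std::vector` by value remains the simpler choice.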

Andy Prowl