19

I will start by saying I've read this topic: C++ Return reference / stack memory. But there, the question used a std::vector<int> as the object type, and I thought the behavior of std::string was different. Wasn't this class made especially for using strings without having to worry about memory leaks and incorrect memory usage?

So, I already know this is wrong:

std::vector<t> &function()
{
    vector<t> v;
    return v;
}

But is this wrong as well?

std::string &function()
{
    string s = "Faz";
    s += "Far";
    s += "Boo";
    return s;
}

Thanks


Extra question (EDIT): So, am I correct when I say that returning a std::string by value doesn't copy the char sequence, only a pointer to the char array and a size_t for the length?

If this statement is correct, is this the valid way to create a deep copy of a string (to avoid manipulating two strings at once)?

string orig = "Baz";
string copy = string(orig);
Martijn Courteaux
  • In answer to your extra question: no, you've got the wrong idea. Creating a copy of a `std::string` always creates a copy. What the answers to this question rightly point out is that with RVO, it's possible to write the function correctly (returning a value) and _avoid_ ever creating a copy in the first place. See @Martinho's answer below: no copies! – Tom Jun 17 '11 at 16:35

5 Answers

39

It doesn't matter what the type is; this pattern is always completely, 100% wrong for any object type T:

T& f() {
    T x;
    return x;
}   // x is destroyed here and the returned reference is thus unusable

If you return a reference from a function, you must ensure that the object to which it refers will still exist after the function returns. Since objects with automatic storage duration are destroyed at the end of the block in which they are declared, they are guaranteed not to exist after the function returns.
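
As a contrast, here is a minimal sketch of two cases where returning a reference is fine, because the referenced object outlives the call. The names `Greeter` and `default_name` are hypothetical and only serve as illustration; they do not appear in the question.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical class for illustration: the string is a data member, so it
// lives as long as the Greeter object, not just for the duration of a call.
class Greeter {
public:
    explicit Greeter(std::string name) : name_(std::move(name)) {}

    // OK: the referenced member outlives this call (as long as *this does).
    const std::string& name() const { return name_; }

private:
    std::string name_;
};

// OK: a function-local static has static storage duration, so it still
// exists after the function returns.
const std::string& default_name() {
    static const std::string value = "FazFarBoo";
    return value;
}
```

A caller can write `const std::string& r = g.name();` for some `Greeter g`, but `r` is only valid while `g` itself is alive; the same lifetime rule applies, just one level up.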

James McNellis
  • Thanks for the correct answer to my first question, but I accepted the other answer because Blindy helped me with the extra question. But you surely do have my +1! – Martijn Courteaux Jun 18 '11 at 08:19
32

You are really close to making those functions work:

std::string function()
{
    std::string s = "Faz";
    s += "Far";
    s += "Boo";
    return s;
}

Simply make them return a copy instead of a reference and you're set. This is what you want, a copy of the stack-based string.

It gets better too: with the return value optimization (RVO), the compiler will typically create the string only once, directly in the storage the caller provides for the result, much as if you had created it on the heap and returned a reference to it, all behind the scenes!

Blindy
  • @Blindy: Thanks, I know this is a solution, but I was thinking about performance. – Martijn Courteaux Jun 17 '11 at 16:25
  • @Martijn, RVO makes it as fast as reference-calls, because the return ***is*** a reference behind the scenes. – Blindy Jun 17 '11 at 16:30
  • @Blindy: So, your comment "because the return is a reference behind the scenes." is the answer to my extra question? – Martijn Courteaux Jun 17 '11 at 16:32
  • @Martijn, no, the answer to your extra question is a "yes, but that's unrelated". RVO is only for return values passed by value (non-reference) from functions. There is absolutely no copying involved. `string(otherstring)` does indeed return a deep copy (once you modify it at least), but that uses the copy constructor. – Blindy Jun 17 '11 at 16:37
  • @Blindy: Sorry, my extra question was wrong: I meant *returning* a string ... Is my statement now correct? Thanks, anyway. – Martijn Courteaux Jun 17 '11 at 16:42
  • @Martijn, again, there is absolutely no copying involved of any kind, be it pointer, reference or native type (`size_t` in your example). None whatsoever. That's the whole point, to avoid any copying for performance reasons. – Blindy Jun 17 '11 at 16:45
10

Don't return references, return by value:

std::string function() // no ref
{
    std::string s = "Faz";
    s += "Far";
    s += "Boo";
    return s;
}

If your compiler can do named return value optimization (NRVO), which is likely, it will transform this into something roughly equivalent to the following, which avoids any extraneous copies:

// Turn the return value into an output parameter:
void function(std::string& s)
{
    s = "Faz";
    s += "Far";
    s += "Boo";
}

// ... and at the callsite,
// instead of:
std::string x = function();
// it does something equivalent to this:
std::string x; // allocates x in the caller's stack frame
function(x); // passes x by reference
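
One way to check this yourself is to count constructor calls. The sketch below assumes C++11 or later; `Tracer` and `make_tracer` are hypothetical names introduced only for this illustration. If the compiler applies NRVO, neither counter changes. If it does not, the local is moved rather than copied, so the copy counter should stay at zero either way.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical helper type (not part of the standard library) that counts
// how often it is copy-constructed or move-constructed.
struct Tracer {
    static int copies;
    static int moves;
    std::string text;

    explicit Tracer(std::string t) : text(std::move(t)) {}
    Tracer(const Tracer& other) : text(other.text) { ++copies; }
    Tracer(Tracer&& other) noexcept : text(std::move(other.text)) { ++moves; }
};

int Tracer::copies = 0;
int Tracer::moves = 0;

// Returns a named local by value, like function() above.
Tracer make_tracer() {
    Tracer t("Faz");
    t.text += "Far";
    t.text += "Boo";
    return t;  // NRVO candidate; if not elided, t is treated as an rvalue
}
```

After `Tracer x = make_tracer();`, inspecting `Tracer::copies` and `Tracer::moves` shows what the compiler actually did. The move count depends on the compiler and its flags, so it is deliberately not stated here.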

Regarding the extra question:

The copy constructor of string always does a deep copy. So, if there are copies involved, there are no aliasing issues. But when returning by value with NRVO, as you can see above, no copies are made.

You can make copies using several different syntaxes:

string orig = "Baz";
string copy1 = string(orig);
string copy2(orig);
string copy3 = orig;

The second and third have no semantic difference: they're both just initialization. The first one creates a temporary by calling the copy constructor explicitly, and then initializes the variable with a copy. But a compiler can do copy elision here (and it's very likely that it will) and will make only one copy.
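
To see that all three forms give independent strings, a small sketch like the following can help. The function name `copies_are_independent` is hypothetical; it modifies each copy and checks that the original is left untouched.

```cpp
#include <cassert>
#include <string>

// Hypothetical check: each copy owns its own characters, so appending to a
// copy changes neither the original nor the other copies.
bool copies_are_independent() {
    std::string orig = "Baz";
    std::string copy1 = std::string(orig);
    std::string copy2(orig);
    std::string copy3 = orig;

    copy1 += "1";
    copy2 += "2";
    copy3 += "3";

    return orig == "Baz" && copy1 == "Baz1" &&
           copy2 == "Baz2" && copy3 == "Baz3";
}
```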

R. Martinho Fernandes
2

You can take the address of the returned string and compare it with the address of the original string, as shown below:

#include <iostream>
#include <string>
using namespace std;

string f() {
    string orig = "Baz";
    string copy1 = string(orig);
    string copy2(orig);
    string copy3 = orig;

    cout << "orig addr: " << &orig << endl;
    cout << "copy1 addr: " << &copy1 << endl;
    cout << "copy2 addr: " << &copy2 << endl;
    cout << "copy3 addr: " << &copy3 << endl;
    return orig;
}

int main() {
    string ret = f();
    cout << "ret addr: " << &ret << endl;
}

I got the following:

orig addr: 0x7ffccb085230
copy1 addr: 0x7ffccb0851a0
copy2 addr: 0x7ffccb0851c0
copy3 addr: 0x7ffccb0851e0
ret addr: 0x7ffccb085230

You see that orig and ret have the same address, so no copy was made on return: the compiler applied NRVO and constructed orig directly in the storage of ret. copy1, copy2 and copy3 are copies of orig, which is why they are different objects at different addresses.

Pavel
2

The problem with this (regardless of the type) is that you're returning a reference to an object that goes out of scope once the return is hit.

std::string &function()
{
    std::string s = "Faz";
    s += "Far";
    s += "Boo";

    // s is about to go out of scope here and therefore the caller cannot access it
    return s;
}

You would want to change the return type from a reference to a value, so that the caller receives a copy of s rather than a dangling reference.

std::string function()
{
    std::string s = "Faz";
    s += "Far";
    s += "Boo";

    // copy of s is returned to caller, which is good
    return s;
}
Scott Saad