75

Whilst refactoring some code I came across some getter methods that returns a std::string. Something like this for example:

class foo
{
private:
    std::string name_;
public:
    std::string name()
    {
        return name_;
    }
};

Surely the getter would be better returning a const std::string&? The current method is returning a copy which isn't as efficient. Would returning a const reference instead cause any problems?

Rob
  • 76,700
  • 56
  • 158
  • 197
  • Who said it was not efficent. A lot of work has been done on std::string to make make this operation efficient. There is a difference between passing a reference and a strincg but the actual difference is usually insignificant. – Martin York Sep 25 '08 at 18:03
  • 11
    @Loki: nothing says that it will be efficient, and in fact the most recent version of C++ says that it probably will be. the only optimization that would make that code reasonably efficient would be a reference-counted std::string. Many STL implementations don't do reference counting, though, as on modern multicore CPUs it's way slower to manage reference counts than you might think. So yes, returning a copy is much slower. Neither GCC's nor Microsoft's STL has done reference counting on std::string in many years. – Sean Middleditch Oct 26 '11 at 17:19
  • 3
    @seanmiddleditch: Always measure before you assume (that's all I am saying). There are also many other optimizations the compiler can apply: Inlining and copy elision would be two that reduce the cost to zero. – Martin York Oct 26 '11 at 18:06
  • 4
    @Loki: yes, that is good advice; always profile. Also good advice is to know the difference between "assumption" and "fact." I guarantee you that Rob's code can never result in zero overhead in any implementation of a conforming C++ compiler. The language mandates that the copy constructor must be invoked at least once when returning from foo::name(), even with inlining (has no effect on when constructors are invoked), copy elision (not applicable to this code), and RVO (might reduce extra copy constructor calls, but not eliminate all of them). – Sean Middleditch Oct 31 '11 at 07:40
  • @seanmiddleditch: Yes you are using section 12.8 par 32 for your argument (please specify the section when quote the standard in your argument). I apologies for my use of the term copy elision as that is at lot more specific than I anticipated. But I disagree with about it having zero overhead. You forget the As-If rule (1.9 para 1) can also be used here to remove any redundant code that has no side effect. Please try and measure it. – Martin York Oct 31 '11 at 18:56
  • 3
    @Loki: I _did_ try to measure it, on both GCC 4.6 and VC 10, both with maximum optimization turned on, and got the same results: the copy constructor is invoked. This is the "obvious" result even if you take the as-if rule into account, as std::string's copy constructor does in fact have side effects: it either creates a whole new character array and a copy of the string or it manipulates the reference count value (depending on implementation). Note that a simple-enough micro benchmark on a ref-counted impl might not give you "real" results here, which I ran into on GCC on my first try. – Sean Middleditch Nov 04 '11 at 18:36

12 Answers12

60

The only way this can cause a problem is if the caller stores the reference, rather than copy the string, and tries to use it after the object is destroyed. Like this:

foo *pFoo = new foo;
const std::string &myName = pFoo->getName();
delete pFoo;
cout << myName;  // error! dangling reference

However, since your existing function returns a copy, then you would not break any of the existing code.

Edit: Modern C++ (i. e. C++11 and up) supports Return Value Optimization, so returning things by value is no longer frowned upon. One should still be mindful of returning extremely large objects by value, but in most cases it should be ok.

Dima
  • 38,860
  • 14
  • 75
  • 115
  • Wouldn't break existing code, but wouldn't gain _any_ performance advantages. If problems were as easy to spot as in your example, that'd be fine, but it can often be much more tricky.. for example, if you have vector, and push_back, causing a resize, etc.. premature optimization, etc, etc. – tfinniga Sep 25 '08 at 18:36
  • Of course, you get a performance advantage. If you return the object, you pay the cost of creating an extra copy of the object, the temporary that is returned. You do not pay that cost if you return a reference. – Dima Sep 25 '08 at 19:06
  • 7
    @Dima: It is likely that most compilers will perform Named Value Return Value Optimisation in this case. – Richard Corden Sep 26 '08 at 09:35
  • 1
    @Richard: yes, I've read about that recently. But this only covers the case when you assign the returned value to something immediately, right? What if the returned value is passed to a function? Will the construction of the temporary still be optimized away? – Dima Sep 30 '08 at 19:02
  • I think you are correct on the fact that it only applies in the case you mentioned – BlueTrin Apr 19 '11 at 14:53
  • 1
    So what was the final verdict? Better to pass a reference back or just pass the copy? Let's assume the compiler is dumb, and a programmer has to actually be specific and thoughtful like back in the day before we became lazy :) – Volte Jul 09 '12 at 12:51
  • 2
    @Dale, I would advise returning a const reference. – Dima Jul 09 '12 at 15:10
  • Point about RVO is misleading. RVO will save you the unnecessary copies/moves, it will not save you from creation of a copy of a member. If your member string is 1KB string, you will still allocate 1KB when you return by value. – NoSenseEtAl Aug 24 '20 at 15:01
32

Actually, another issue specifically with returning a string not by reference, is the fact that std::string provides access via pointer to an internal const char* via the c_str() method. This has caused me many hours of debugging headache. For instance, let's say I want to get the name from foo, and pass it to JNI to be used to construct a jstring to pass into Java later on, and that name() is returning a copy and not a reference. I might write something like this:

foo myFoo = getFoo(); // Get the foo from somewhere.
const char* fooCName = foo.name().c_str(); // Woops!  foo.name() creates a temporary that's destructed as soon as this line executes!
jniEnv->NewStringUTF(fooCName);  // No good, fooCName was released when the temporary was deleted.

If your caller is going to be doing this kind of thing, it might be better to use some type of smart pointer, or a const reference, or at the very least have a nasty warning comment header over your foo.name() method. I mention JNI because former Java coders might be particularly vulnerable to this type of method chaining that may seem otherwise harmless.

Ogre Psalm33
  • 21,366
  • 16
  • 74
  • 92
  • 3
    Damn, good catch. Although I think C++11 respecced the lifespan of temporaries to make this code valid for any compliant compiler. – Mark McKenna May 08 '17 at 13:22
  • @MarkMcKenna That's good to know. I did a little searching, and this answer seems to indicate you still need to be a little careful, even with C++ 11: http://stackoverflow.com/a/2507225/13140 – Ogre Psalm33 May 15 '17 at 12:19
18

One problem for the const reference return would be if the user coded something like:

const std::string & str = myObject.getSomeString() ;

With a std::string return, the temporary object would remain alive and attached to str until str goes out of scope.

But what happens with a const std::string &? My guess is that we would have a const reference to an object that could die when its parent object deallocates it:

MyObject * myObject = new MyObject("My String") ;
const std::string & str = myObject->getSomeString() ;
delete myObject ;
// Use str... which references a destroyed object.

So my preference goes to the const reference return (because, anyway, I'm just more confortable with sending a reference than hoping the compiler will optimize the extra temporary), as long as the following contract is respected: "if you want it beyond my object's existence, they copy it before my object's destruction"

paercebal
  • 81,378
  • 38
  • 130
  • 159
10

Some implementations of std::string share memory with copy-on-write semantics, so return-by-value can be almost as efficient as return-by-reference and you don't have to worry about the lifetime issues (the runtime does it for you).

If you're worried about performance, then benchmark it (<= can't stress that enough) !!! Try both approaches and measure the gain (or lack thereof). If one is better and you really care, then use it. If not, then prefer by-value for the protection it offers agains lifetime issues mentioned by other people.

You know what they say about making assumptions...

rlerallut
  • 7,545
  • 5
  • 23
  • 21
8

Okay, so the differences between returning a copy and returning the reference are:

  • Performance: Returning the reference may or may not be faster; it depends on how std::string is implemented by your compiler implementation (as others have pointed out). But even if you return the reference the assignment after the function call usually involves a copy, as in std::string name = obj.name();

  • Safety: Returning the reference may or may not cause problems (dangling reference). If the users of your function don't know what they are doing, storing the reference as reference and using it after the providing object goes out of scope then there's a problem.

If you want it fast and safe use boost::shared_ptr. Your object can internally store the string as shared_ptr and return a shared_ptr. That way, there will be no copying of the object going and and it's always safe (unless your users pull out the raw pointer with get() and do stuff with it after your object goes out of scope).

Frank
  • 64,140
  • 93
  • 237
  • 324
4

I'd change it to return const std::string&. The caller will probably make a copy of the result anyway if you don't change all the calling code, but it won't introduce any problems.

One potential wrinkle arises if you have multiple threads calling name(). If you return a reference, but then later change the underlying value, then the caller's value will change. But the existing code doesn't look thread-safe anyway.

Take a look at Dima's answer for a related potential-but-unlikely problem.

Kristopher Johnson
  • 81,409
  • 55
  • 245
  • 302
3

It is conceivable that you could break something if the caller really wanted a copy, because they were about to alter the original and wanted to preserve a copy of it. However it is far more likely that it should, indeed, just be returning a const reference.

The easiest thing to do is try it and then test it to see if it still works, provided that you have some sort of test you can run. If not, I'd focus on writing the test first, before continuing with refactoring.

Airsource Ltd
  • 32,379
  • 13
  • 71
  • 75
1

Odds are pretty good that typical usage of that function won't break if you change to a const reference.

If all of the code calling that function is under your control, just make the change and see if the compiler complains.

17 of 26
  • 27,121
  • 13
  • 66
  • 85
1

Does it matter? As soon as you use a modern optimizing compiler, functions that return by value will not involve a copy unless they are semantically required to.

See the C++ lite FAQ on this.

christopher_f
  • 1,975
  • 1
  • 13
  • 12
  • Is return-by-value optimization so sophisticated that it would use a pointer/reference to the original member value i.o. of a copy? I don't think this is the case. Rbv optimization skips a few steps in returning a value but it would still call the copy ctor of cstring, albeit the in-place version. – QBziZ Sep 26 '08 at 20:00
  • If the function returns a copy of a class member variable as in this example, then it really will return a copy . The original member can't be destroyed in the process (unlike a function that returns by value and does create a temporary to return for example) – M.M Nov 12 '19 at 02:36
0

Depends what you need to do. Maybe you want to all the caller to change the returned value without changing the class. If you return the const reference that won't fly.

Of course, the next argument is that the caller could then make their own copy. But if you know how the function will be used and know that happens anyway, then maybe doing this saves you a step later in code.

Joel Coehoorn
  • 399,467
  • 113
  • 570
  • 794
0

I normally return const& unless I can't. QBziZ gives an example of where this is the case. Of course QBziZ also claims that std::string has copy-on-write semantics which is rarely true today since COW involves a lot of overhead in a multi-threaded environment. By returning const & you put the onus on the caller to do the right thing with the string on their end. But since you are dealing with code that is already in use you probably shouldn't change it unless profiling shows that the copying of this string is causing massive performance problems. Then if you decide to change it you will need to test thouroughly to make sure you didn't break anything. Hopefully the other developers you work with don't do sketchy stuff like in Dima's answer.

Brett Hall
  • 913
  • 7
  • 12
-1

Returning a reference to a member exposes the implementation of the class. That's could prevent to change the class. May be usefull for private or protected methods incase the optimization is needed. What should a C++ getter return

Community
  • 1
  • 1