1

We all know that returning a reference to a local variable is a bad idea. However, I'm wondering if it's ever really a good idea to return a reference at all, and whether it's possible to determine some good rules about when to do it and when not to.

My problem with returning a reference is that the calling function needs to care about the lifetime of an object that shouldn't be its responsibility. As a contrived example:

#include <vector>

const int& foo() {
  std::vector<int> v = {1, 2, 3, 4, 5};
  return v[0];
}

int main(int argc, const char* argv[])
{
  const int& not_valid = foo();
  return 0;
}

Here, the vector goes out of scope at the end of foo, destroying its contents and invalidating any references to its elements. vector::operator[] returns a reference to the element, and so when this reference is further returned out of foo, the reference in main is dangling. I don't believe the const reference will extend the lifetime here because it's not a reference to a temporary.
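For contrast, here is a minimal sketch of the case where lifetime extension does apply, namely when the function returns by value and the caller binds the resulting temporary to a const reference. The names `first_by_value` and `read_through_const_ref` are hypothetical and only serve the illustration.

```cpp
#include <vector>

// Returns a copy of the first element, so the caller owns the result.
int first_by_value() {
    std::vector<int> v = {1, 2, 3, 4, 5};
    return v[0];  // the value is copied out before v is destroyed
}

// Hypothetical helper: binds a const reference to the returned temporary.
int read_through_const_ref() {
    const int& r = first_by_value();  // the temporary lives as long as r does
    return r;                         // safe: r refers to a live object
}
```

Returning `const int&` from `foo` instead, as in the example above, gives the caller a reference to an element that is already gone, and no lifetime extension takes place.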

As I said, this is a contrived example and the writer of foo probably wouldn't be so silly as to try to return v[0] as a reference. However, it's easy to see how returning a reference requires the caller to care about the lifetime of an object it doesn't own. Pushing an element into a vector copies it, so then the vector is responsible for it. This problem doesn't exist for passing a reference argument because you know the function will complete before the caller continues and destroys the object.

I can see that returning a reference allows some nice array-like syntax like v[0] = 5 - but what's so bad about having a member function like v.set(index, value)? At least with this we wouldn't be exposing the internal objects. I know there may also be a performance increase from returning a reference, but with RVO, Named RVO (NRVO), and move semantics it is either negligible or non-existent.
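To make the alternative concrete, here is a rough sketch of the kind of interface I have in mind. `IntBox`, `get` and `set` are hypothetical names, not part of any library, and the bounds-checked `at` is just one possible choice.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical wrapper that never hands out references to its elements.
class IntBox {
public:
    explicit IntBox(std::size_t n) : data_(n, 0) {}

    // Returns a copy, so the caller never holds a reference into data_.
    int get(std::size_t index) const { return data_.at(index); }

    // Modifies the element in place without exposing it.
    void set(std::size_t index, int value) { data_.at(index) = value; }

private:
    std::vector<int> data_;
};
```

With this, `box.set(0, 5)` takes the place of `v[0] = 5`, and `box.get(0)` yields a value that stays valid after `box` is destroyed. For an element type that is expensive to copy, `get` would make a full copy on every call.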

So I've been trying to imagine under which situations returning a reference is ever truly safe, but I can't get my head around all the different permutations of ownership semantics that it might involve. Are there any good rules on when to do this?

Note: I know a better way to deal with ownership in vectors is to use smart pointers, but then you get the same problem with a different object - who owns the smart pointer?

Joseph Mansfield
  • 1
    You don't have the same problem with a different object with smart pointers. Smart pointers are always stored by value, and whoever stores the value owns the pointer. I don't see the problem there. Also I don't know what you mean by returning a reference being "truly safe". Can you show why `operator[]` returning a reference is less "safe" than using a `set` function? – Seth Carnegie Oct 25 '12 at 22:02
  • Why just references? Same arguments are valid for pointers. – SomeWittyUsername Oct 25 '12 at 22:04
  • @SethCarnegie If I have a vector of smart pointers, access one of them (`v[0]`), keeping it around while the vector goes out of scope, I have the same problem as before. I've got a reference to a smart pointer that isn't valid. With the set function, I don't have to care about avoiding dangling references. – Joseph Mansfield Oct 25 '12 at 22:04
  • @icepack Very true. I just tend not to think with pointers very often. – Joseph Mansfield Oct 25 '12 at 22:05
  • 1
    @sftrabbit not unless you stored a reference to the smart pointer, which is a moot point since you _always_ store smart pointers by value or they are worthless. – Seth Carnegie Oct 25 '12 at 22:05
  • 1
    Just consider "local" transitive. `v` is local, and so are any values contained by `v`. – GManNickG Oct 25 '12 at 22:05
  • @sftrabbit also you are saying that you _access_ one of them with `v[0]`, but that problem is solved by the set function? One is for getting, one is for setting. – Seth Carnegie Oct 25 '12 at 22:07
  • @SethCarnegie I mean `operator[]` could return a copy and `set` would just modify the element. Two functions for two jobs. – Joseph Mansfield Oct 25 '12 at 22:08
  • 1
    @sftrabbit oh I see. In that case, RVO, NRVO, and move-semantics help you absolutely none, because you're not returning an expiring value from which to move, or using a temporary that can be constructed in-place via RVO. There is no way around copying the object (since you end up with _two_, not one), which can be expensive. – Seth Carnegie Oct 25 '12 at 22:12

2 Answers

10

There are tons of good uses for returning a reference. One is, as you said, to emulate something like the native dereference operator:

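#include <cstddef>

struct Goo
{
    int &       operator[](std::size_t i)       { return arr[i]; }
    // const overload, needed to index a Goo const & (as operator<< below does)
    int const & operator[](std::size_t i) const { return arr[i]; }
    int &       front()                         { return arr[0]; }

    // etc.

private:
    int * arr; // assumed to point at storage that outlives any returned reference
};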

Another use case is when you return a reference to a thing that was passed in. The typical example is a chainable operation like <<:

std::ostream & operator<<(std::ostream & os, Goo const & g)
{ 
    return os << g[3];
}
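Here is a self-contained sketch of why returning the stream by reference matters: each `<<` hands the same stream on to the next one. `Point` and `describe` are hypothetical names used only for this illustration.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical type used only for this illustration.
struct Point { int x; int y; };

// Returns the stream that was passed in, so calls can be chained.
std::ostream& operator<<(std::ostream& os, Point const& p) {
    return os << '(' << p.x << ", " << p.y << ')';
}

// Chains several insertions; each one operates on the same stream object.
std::string describe(Point const& a, Point const& b) {
    std::ostringstream out;
    out << a << " -> " << b;
    return out.str();
}
```

The returned reference cannot dangle here, because the stream was supplied by the caller and therefore outlives the call.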

As a final example, here's a global object whose initialization is thread-safe (guaranteed for function-local statics since C++11):

Goo & get_the_object()
{
    static Goo impl;
    return impl;
}
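A small sketch of how such an accessor behaves in use, with a hypothetical `Counter` type and `bump` helper standing in for real state: every call returns a reference to the same object, which lives until the program exits.

```cpp
// Hypothetical type used only for this illustration.
struct Counter { int value = 0; };

Counter& get_counter() {
    static Counter impl;  // constructed once, on the first call
    return impl;          // valid until the program exits
}

// Increments the shared counter and returns its new value.
int bump() { return ++get_counter().value; }
```

Note that only the initialization is guarded. Concurrent calls to something like `bump` would still need their own synchronization.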

References are an integral part of the language, and they may well be returned by a function call. As you said, it's important to understand the lifetime of objects, but that's always true and not a particular problem of returning references.

Kerrek SB
0

Personally, I like returning references to static variables when I want to implement the Singleton pattern:

SomeClass& getTheSingleton()
{
    static SomeClass theInstance;
    return theInstance;
}

I don't have to write any logic to check whether some pointer has been initialized, and it gives me some control over the order of static initialization.
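As a rough sketch of the initialization-order point, assuming hypothetical `Logger` and `App` types and an `init_log` helper that records construction order: because each instance is created on first use, a constructor can call another accessor and be sure its dependency already exists.

```cpp
#include <string>
#include <vector>

// Hypothetical helper that records the order in which objects are constructed.
std::vector<std::string>& init_log() {
    static std::vector<std::string> log;
    return log;
}

struct Logger {
    Logger() { init_log().push_back("Logger"); }
};

Logger& get_logger() {
    static Logger instance;
    return instance;
}

struct App {
    // Touching get_logger() first forces the Logger to be constructed before App finishes.
    App() { get_logger(); init_log().push_back("App"); }
};

App& get_app() {
    static App instance;
    return instance;
}
```

Calling `get_app()` first still constructs the `Logger` before the `App`, which is not something you can rely on for plain globals defined in different translation units.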

Ben