41

I'd much prefer to use references everywhere but the moment you use an STL container you have to use pointers unless you really want to pass complex types by value. And I feel dirty converting back to a reference, it just seems wrong.

Is it?

To clarify...

MyType *pObj = ...
MyType &obj = *pObj;

Isn't this 'dirty', since you can (even if only in theory since you'd check it first) dereference a NULL pointer?

EDIT: Oh, and you don't know if the objects were dynamically created or not.

Steve Guidi
Mr. Boy
  • I'd say you can deference a pointer all you want but I'm not sure what such an act would imply. – Edward Strange Aug 09 '10 at 21:52
  • @John: http://www.youtube.com/watch?v=sf-5RaFnh2U#t=2m14s :) – Merlyn Morgan-Graham Aug 09 '10 at 21:56
  • @John: Could you tell us whether the objects that you're storing pointers to are dynamically allocated? I assumed this was the case, but isn't an absolute need. – Steven Sudit Aug 09 '10 at 22:01
  • @John: Do *not* ever put yourself in a position to need to delete *anything*, it needs to be wrapped up. Use a `shared_ptr` implementation, a pointer container, or in C++0x a `unique_ptr`. Also, don't guess optimizations. Store them by value, and if performance becomes a *measured* problem, store smart pointers to values instead. – GManNickG Aug 09 '10 at 22:04
  • @GMan: Imagine that the instances are owned by a `vector`, but pointers to some of these are added to another collection, such as a `map`. That's one of the ways that's safe, so long as the vector's scope is a superset of the map's. – Steven Sudit Aug 09 '10 at 22:06
  • @GMan... storing by value is just plain _wrong_. Copying a complex type implies you _mean_ to copy it as part of the logic, not that you can't be bothered to find a better way. Not to mention, the majority of C++ classes I see are probably not safe to copy-by-value anyway... I think that's probably the case across C++ code around the world. – Mr. Boy Aug 09 '10 at 22:06
  • @Steven: In which case he isn't in a position to need to delete anything. :) I'll admit I jumped to a conclusion. @John: No, it's plain *normal*. Please show me your real profiling results from a real application that say most of your time is spent copying objects around in a vector. Copying objects is no doubt often more expensive than copying a pointer, but you shouldn't underestimate your compiler and environment. Indeed, heap allocating things is quite a slow operation. Good, normal code *first*, optimized code *last*. Also, failed copy-semantics is not my problem or... – GManNickG Aug 09 '10 at 22:14
  • ..the containers. Saying "well it doesn't work with this poorly written code!" is hardly an argument. Fix the code, instead. – GManNickG Aug 09 '10 at 22:15
  • @GMan: I don't think John has ever said anything about performance, although he did mention that the types were "complex". He also said that the semantics of copying were incorrect for the objects he's referring to. Surely "good, normal" code must have the property that it is semantically correct. Writing code that isn't wrong is not a premature optimisation... – Steve Jessop Aug 09 '10 at 22:25
  • @Steve: The only objection I can deduce from his argument is that performance is a concern (hence the "really want to pass complex types by value" and finding a "better way" to copy things). Writing code to *hack* wrong code is not good code, it's a hack. One should fix the class with broken copy-semantics. – GManNickG Aug 09 '10 at 22:31
  • @GMan: OK, look at my example. I don't "really want to pass complex types by value", because my function `bar` *modifies* its input. Pass-by-value is useless. I would copy it only if I "meant to copy it as part of the logic". There is no question of optimisation, premature or otherwise - pointers *do exactly what I want*. The only observable difference between me and John is that I've shown you my code (perhaps because I've invented a very simple case, and he hasn't, although that's pure speculation). – Steve Jessop Aug 09 '10 at 22:49
  • @Steve: Oh, we're discussing different things. D: I thought we were still on design choices when owning resources, the whole use a smart container line and preference of choice between by-value and by-pointer. I wasn't talking about any individual situation, I agree with you fully there. I see how this would seem confusing, then. :S I'm sorry for not seeing we were talking about a particular situation where we don't own the resources. (I thought John's "storing by value is just plain wrong." comment was a general design claim, not something in his case.) I revoke my comments in that regard. – GManNickG Aug 09 '10 at 22:56
  • Yeah, that's what I think, anyway: he meant "in the case I need to code, copying is wrong". I think that talking about non-copyable or "complex" objects has just confused the issue. Although: I did once write some code that accidentally passed something stupid like a largeish map of maps by value, and I'd finished fixing it long before it had finished running. So I do think *sometimes* you can tell in advance not to pass by value. Anyway, will let him go back to speaking for himself now ;-) – Steve Jessop Aug 09 '10 at 23:01
  • @GMan: Looks like we're on the same page now. – Steven Sudit Aug 09 '10 at 23:06
  • @Steven: Indeed, apologies for more conclusion jumping. I'll admit I've been on a short "fuse" so-to-speak lately. – GManNickG Aug 09 '10 at 23:08
  • @GMan: funny you should say that, I've been mostly off SO for a while, and coming back to it I think a higher proportion of C++ question titles than before are making me think, "this is so stupid/boring/pointless I'm not even going there". Maybe it's the weather. Although my other theory is that programming Python is making me lose patience with C++ in general ;-) – Steve Jessop Aug 09 '10 at 23:14
  • @Steve: I noticed you seemed to be less active. I agree the "hard" questions are lacking a bit. :) @OP: Though I do maintain "the moment you use an STL container you have to use pointers unless you really want to pass complex types by value" sounds like a generic design claim. :) – GManNickG Aug 09 '10 at 23:15
  • @GMan: true, but I took "really want to pass complex types by value" as meaning, "genuinely want copies of the objects". i.e, something that sometimes happens and sometimes doesn't. I guess you took it to be a rhetorical device, meaning, "Copy a complex type? Not while I'm still breathing!"? I considered that interpretation too, but like I said, I *think* that a bit of performance anxiety over "complex" types is confusing the issue even though it needn't be relevant. The same is true of `ints` - you need a container of pointers unless you really want to copy them. Which you often do. – Steve Jessop Aug 09 '10 at 23:19
  • @Steve: Yeah, I see that way now. Tsk tsk internet. – GManNickG Aug 09 '10 at 23:26
  • I _never_ saw a book or article on C++ which advocated passing objects by value unless they are pretty minimal. It's not even about performance, it's just silly to do it and smacks of spending too much time in Java/C#. – Mr. Boy Aug 10 '10 at 07:42
  • On the "classes often are not correct for copying"... my point is not that my code is like this but that code in general I come across is. It's all well and good to be idealistic but pragmatism is preferable in my book... just like how all browsers _should_ be the same but you don't write HTML on that assumption. – Mr. Boy Aug 10 '10 at 07:43
  • p.s: if this is such a trivial topic why's it got so many comments and discussions? Simple questions typically get one or two answers stating the answer and then die. – Mr. Boy Aug 10 '10 at 07:50
  • @John: Right, the usual pattern is to pass non-trivial objects by reference. Preferably, a const reference. – Steven Sudit Aug 10 '10 at 08:44
  • I've seen it like this: `ClassA& myFunc() { ASSERT(m_ptr); return *m_ptr; }` EEEK! – Mike Caron Jun 18 '12 at 19:53

7 Answers

22

Ensure that the pointer is not NULL before you convert it to a reference, and that the object will remain valid for as long as your reference does (in scope for a stack object, still allocated for a heap object), and you'll be okay, and morally clean :)
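A minimal sketch of that check, assuming a hypothetical `Widget` type in place of the question's `MyType`; the helper names `bump` and `bump_all` are made up for illustration and are not from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical type standing in for MyType from the question.
struct Widget {
    int value;
};

// Takes a pointer that may be null and only binds a reference after checking.
// Returns false when there was nothing to work on.
bool bump(Widget *pObj) {
    if (pObj == NULL) {
        return false;          // never form *pObj from a null pointer
    }
    Widget &obj = *pObj;       // safe: pObj points at a live Widget
    obj.value += 1;
    return true;
}

// Applies bump() to every entry in a container of non-owning pointers
// and reports how many entries were non-null.
std::size_t bump_all(const std::vector<Widget*> &items) {
    std::size_t touched = 0;
    for (std::size_t i = 0; i < items.size(); ++i) {
        if (bump(items[i])) ++touched;
    }
    return touched;
}
```

The lifetime half of the advice is not something the code can check: the caller still has to make sure every `Widget` outlives the container of pointers to it.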

Merlyn Morgan-Graham
  • How can a dynamically allocated object be in scope? – Billy ONeal Aug 09 '10 at 21:54
  • Null references are (inconveniently) undefined. – Steven Sudit Aug 09 '10 at 21:55
  • @Billy: We don't know that they're dynamically allocated, just that we're pointing at them. – Steven Sudit Aug 09 '10 at 21:55
  • @Steven: Actually we do. The OP explicitly referred to objects inside of STL containers. – Billy ONeal Aug 09 '10 at 21:56
  • @Billy: Re-read what they wrote. They're using STL containers but they don't want to store it by value in the container, necessitating a copy constructor on insertion. They want a container of (smart) pointers to values that may be dynamically allocated, or may not be (such as a static array, for example). – Steven Sudit Aug 09 '10 at 21:59
  • @Steven: If it's in a smart pointer owned by an STL container, you can be assured it's dynamically allocated as well. Smart pointers don't own stack allocated objects. – Billy ONeal Aug 09 '10 at 23:15
  • Smart pointers can have null destructors. Which is a good thing if you have a vector of smart pointers to objects, but want to store a stack allocated object for some reason. – gnud Aug 09 '10 at 23:38
  • @Bill: I think gnud's example is fair, but you're generally correct. Just treat the first parentheses as showing an optional trait. – Steven Sudit Aug 09 '10 at 23:48
  • "In scope" is probably not the correct term. I suspect Merlyn meant "the object will remain valid/allocated ..." – Max Lybbert Aug 10 '10 at 00:12
  • @Billy, Max: I didn't mean syntactical scope, I meant logical scope. If the object exists, whether on the stack or on the heap, it is in scope. You could argue that a leaked object is no longer in any form of scope... – Merlyn Morgan-Graham Aug 10 '10 at 02:29
  • @Merlyn: Ah. Note that scope refers specifically to a set of braces {}, hence the confusion ;) Perhaps change "in scope" to "valid"? – Billy ONeal Aug 10 '10 at 03:01
  • @Billy: Okay, fine :) Now mentions the heap – Merlyn Morgan-Graham Aug 10 '10 at 05:16
  • @Merl: Scope usually refers to visibility, whereas we're both talking about lifespan. – Steven Sudit Aug 10 '10 at 07:23
  • I've used scope in both senses (visibility and lifetime). But it looked like it was causing confusion in this case. – Max Lybbert Aug 10 '10 at 07:52
17

Initialising a reference with a dereferenced pointer is absolutely fine, nothing wrong with it whatsoever. If p is a pointer, and if dereferencing it is valid (so it's not null, for instance), then *p is the object it points to. You can bind a reference to that object just like you bind a reference to any object. Obviously, you must make sure the reference doesn't outlive the object (like any reference).

So for example, suppose that I am passed a pointer to an array of objects. It could just as well be an iterator pair, or a vector of objects, or a map of objects, but I'll use an array for simplicity. Each object has a function, order, returning an integer. I am to call the bar function once on each object, in order of increasing order value:

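#include <algorithm>  // std::sort
#include <vector>

// Foo is assumed to be defined elsewhere, with an int order() member.

void bar(Foo &f) {
    // does something
}

// Comparator: orders the pointed-to objects by their order() value
bool by_order(Foo *lhs, Foo *rhs) {
    return lhs->order() < rhs->order();
}

void call_bar_in_order(Foo *array, int count) {
    std::vector<Foo*> vec(count);  // vector of pointers
    for (int i = 0; i < count; ++i) vec[i] = &(array[i]);
    std::sort(vec.begin(), vec.end(), by_order);
    for (int i = 0; i < count; ++i) bar(*vec[i]);  // binds bar's Foo& to each object
}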

The reference that my example has initialized is a function parameter rather than a variable directly, but I could just as validly have done:

for (int i = 0; i < count; ++i) {
    Foo &f = *vec[i];
    bar(f);
}

Obviously a vector<Foo> would be incorrect, since then I would be calling bar on a copy of each object in order, not on each object in order. bar takes a non-const reference, so quite aside from performance or anything else, that clearly would be wrong if bar modifies the input.
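To make the difference concrete, here is a small sketch that runs both approaches side by side. The `Foo` class below is hypothetical (the answer only says it has an `order` function); the `visits` counter and the names `call_bar_on_copies` and `call_bar_on_originals` are assumptions added so the effect of `bar` can be observed.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical Foo: order() comes from the answer, visits() is added for the demo.
class Foo {
public:
    explicit Foo(int order) : order_(order), visits_(0) {}
    int order() const { return order_; }
    int visits() const { return visits_; }
    void visit() { ++visits_; }
private:
    int order_;
    int visits_;
};

// Stand-in for bar(): it modifies its argument.
void bar(Foo &f) { f.visit(); }

bool by_order_value(const Foo &lhs, const Foo &rhs) {
    return lhs.order() < rhs.order();
}

bool by_order_ptr(const Foo *lhs, const Foo *rhs) {
    return lhs->order() < rhs->order();
}

// Wrong for this task: sorts and modifies copies, so the caller's array is untouched.
void call_bar_on_copies(Foo *array, std::size_t count) {
    std::vector<Foo> vec(array, array + count);
    std::sort(vec.begin(), vec.end(), by_order_value);
    for (std::size_t i = 0; i < vec.size(); ++i) bar(vec[i]);
}

// Right: sorts non-owning pointers, then binds a reference to each original.
void call_bar_on_originals(Foo *array, std::size_t count) {
    std::vector<Foo*> vec;
    for (std::size_t i = 0; i < count; ++i) vec.push_back(&array[i]);
    std::sort(vec.begin(), vec.end(), by_order_ptr);
    for (std::size_t i = 0; i < vec.size(); ++i) {
        Foo &f = *vec[i];
        bar(f);
    }
}
```

After `call_bar_on_copies` every element of the caller's array still reports zero visits; after `call_bar_on_originals` each reports one, and the array itself keeps its original order.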

A vector of smart pointers, or a boost pointer vector, would also be wrong, since I don't own the objects in the array and certainly must not free them. Sorting the original array might also be disallowed, or for that matter impossible if it's a map rather than an array.

Steve Jessop
5

No. How else could you implement operator=? You have to dereference this in order to return a reference to yourself.
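For instance, the conventional shape of a copy assignment operator looks roughly like this. `Label` is a hypothetical class used only for illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical class used only to show the usual shape of operator=.
class Label {
public:
    explicit Label(const std::string &text) : text_(text) {}

    Label &operator=(const Label &other) {
        if (this != &other) {      // self-assignment needs no work
            text_ = other.text_;
        }
        return *this;              // `this` is a pointer; *this binds to the Label& return type
    }

    const std::string &text() const { return text_; }

private:
    std::string text_;
};
```

Returning `*this` is what lets assignments chain, as in `a = b = c;`, and it is exactly the pointer-to-reference conversion the question asks about.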

Note though that I'd still store the items in the STL container by value -- unless your object is huge, the overhead of heap allocations is going to mean you're using more storage, and are less efficient, than you would be if you just stored the item by value.

Billy ONeal
  • @Billy: You're probably right about the overhead. Having said that, there are occasionally objects that cannot be copied. – Steven Sudit Aug 09 '10 at 22:01
  • Wouldn't the overhead depend on what type of container you are using? For instance, a `std::vector` reserves memory in chunks and doesn't do a separate allocation for every element you add to it. A `std::set` or `std::map` implementation could very well perform a separate allocation for each element, and thus incur the overhead you speak of. Or perhaps there is something else I'm not considering. Please elaborate. – A. Levy Aug 09 '10 at 22:10
  • @A. Levy: A `vector` will reallocate as needed, copying instances from the old buffer to the new. But, yes, it'll allocate a contiguous range and use placement `new` to instantiate copies in these locations. A map will likely need a single block for each node, but then again, it's not likely to ever copy a node. – Steven Sudit Aug 09 '10 at 23:03
  • @Steven: Unless you copy the map itself. – Billy ONeal Aug 09 '10 at 23:12
  • @Bill: That's true. I was limiting the scope to copies after the initial insertion as a result of other insertions and deletions. – Steven Sudit Aug 09 '10 at 23:44
  • @Steven: Re: First comment: Yes. If the object cannot be copied, by all means store it as a pointer. Plenty of cases where it does make sense to copy though. – Billy ONeal Aug 09 '10 at 23:48
  • @Bill: Good thing copying is the default behavior. – Steven Sudit Aug 09 '10 at 23:54
  • @A Levy: the `vector` reallocates, that's why you preferably use a `deque` whenever you don't need contiguity (for C-API compatibility). – Matthieu M. Aug 10 '10 at 06:59
  • It also means your class has to have an empty/default ctor. That can mean writing extra code just to allow you to put them in containers, when an object in this state is invalid. Just seems messy. – Mr. Boy Aug 10 '10 at 07:45
2

My answer doesn't directly address your initial concern, but it appears you encounter this problem because you have an STL container that stores pointer types.

Boost provides the ptr_container library to address these types of situations. For instance, a ptr_vector internally stores pointers to types, but returns references through its interface. Note that this implies that the container owns the pointer to the instance and will manage its deletion.

Here is a quick example to demonstrate this notion.

#include <string>
#include <boost/ptr_container/ptr_vector.hpp>

void foo()
{
    boost::ptr_vector<std::string> strings;

    strings.push_back(new std::string("hello world!"));
    strings.push_back(new std::string());

    const std::string& helloWorld(strings[0]);
    std::string& empty(strings[1]);
}
Steve Guidi
2

I'd much prefer to use references everywhere but the moment you use an STL container you have to use pointers unless you really want to pass complex types by value.

Just to be clear: STL containers were designed to support certain semantics ("value semantics"), such as "items in the container can be copied around." Since references aren't rebindable, they don't support value semantics (try creating a std::vector<int&> or std::list<double&>: it won't compile). You are correct that you cannot put references in STL containers.

Generally, if you're using references instead of plain objects you're either using base classes and want to avoid slicing, or you're trying to avoid copying. And, yes, this means that if you want to store the items in an STL container, then you're going to need to use pointers to avoid slicing and/or copying.
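The slicing case can be sketched as follows. `Shape` and `Circle` are a hypothetical hierarchy, and the two function names are made up for this illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical hierarchy for illustration.
struct Shape {
    virtual ~Shape() {}
    virtual std::string name() const { return "shape"; }
};

struct Circle : Shape {
    virtual std::string name() const { return "circle"; }
};

// Storing by value copies only the Shape part of the Circle (slicing).
std::string name_via_value_container(const Circle &c) {
    std::vector<Shape> shapes;
    shapes.push_back(c);
    return shapes[0].name();
}

// Storing a non-owning pointer keeps the dynamic type; the reference
// bound from the dereferenced pointer still dispatches to Circle::name().
std::string name_via_pointer_container(Circle &c) {
    std::vector<Shape*> shapes;
    shapes.push_back(&c);
    Shape &s = *shapes[0];
    return s.name();
}
```

The by-value container holds a plain `Shape` built from the `Circle`, so the override is lost; the pointer container refers to the original object, so it is not.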

And, yes, the following is legit (although in this case, not very useful):

#include <iostream>
#include <vector>

// note signature, inside this function, i is an int&
// normally I would pass a const reference, but you can't add
// a "const int*" to a "std::vector<int*>"
void add_to_vector(std::vector<int*>& v, int& i)
{
    v.push_back(&i);
}

int main()
{
    int x = 5;
    std::vector<int*> pointers_to_ints;

    // x is passed by reference
    // NOTE:  this line could have simply been "pointers_to_ints.push_back(&x)"
    // I simply wanted to demonstrate (in the body of add_to_vector) that
    // taking the address of a reference returns the address of the object the
    // reference refers to.
    add_to_vector(pointers_to_ints, x);

    // get the pointer to x out of the container
    int* pointer_to_x = pointers_to_ints[0];

    // dereference the pointer and initialize a reference with it
    int& ref_to_x = *pointer_to_x;

    // use the reference to change the original value (in this case, to change x)
    ref_to_x = 42;

    // show that x changed
    std::cout << x << '\n';
}

Oh, and you don't know if the objects were dynamically created or not.

That's not important. In the above sample, x is on the stack and we store a pointer to x in pointers_to_ints. Sure, pointers_to_ints uses a dynamically-allocated array internally (and delete[]s that array when the vector goes out of scope), but that array holds the pointers, not the pointed-to things. When pointers_to_ints falls out of scope, the internal int*[] is delete[]-ed, but delete is never called on the individual int*s.

This, in fact, makes using pointers with STL containers hard, because the STL containers won't manage the lifetime of the pointed-to objects. You may want to look at Boost's pointer containers library. Otherwise, you'll either (1) want to use STL containers of smart pointers (like boost::shared_ptr, which is legal for STL containers) or (2) manage the lifetime of the pointed-to objects some other way. You may already be doing (2).
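A rough sketch of the two options, assuming a C++11 or later compiler so that `std::shared_ptr` can stand in for `boost::shared_ptr`. The `Tracked` type and both function names are hypothetical; `Tracked` counts live instances only so that the lifetimes can be observed.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical type that counts live instances so lifetime is observable.
struct Tracked {
    static int live;
    Tracked() { ++live; }
    Tracked(const Tracked &) { ++live; }
    ~Tracked() { --live; }
};
int Tracked::live = 0;

// Option (1): the container holds shared_ptrs, so destroying the vector
// releases the objects. Returns the live count seen inside the scope.
int live_inside_shared_scope() {
    std::vector<std::shared_ptr<Tracked> > owned;
    owned.push_back(std::make_shared<Tracked>());
    owned.push_back(std::make_shared<Tracked>());
    Tracked &first = *owned[0];   // a reference from a smart pointer works the same way
    (void)first;
    return Tracked::live;
}

// Option (2): the vector holds raw, non-owning pointers; something else
// (here, a local variable) controls the lifetime.
int live_inside_raw_scope() {
    Tracked on_stack;
    std::vector<Tracked*> not_owned;
    not_owned.push_back(&on_stack);
    return Tracked::live;
}
```

In both functions the live count drops back to zero once the function returns: in the first because the `shared_ptr`s release their objects, in the second because the stack object is destroyed on its own and the vector never tries to delete it.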

Max Lybbert
1

If you want the container to actually contain objects that are dynamically allocated, you shouldn't be using raw pointers. Use unique_ptr or whatever similar type is appropriate.
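As a sketch of what that could look like, assuming a C++11 or later compiler where `std::unique_ptr` is available (the function names `make_names` and `shout` are made up for this example):

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// The vector owns each string through std::unique_ptr;
// no explicit delete appears anywhere.
std::vector<std::unique_ptr<std::string> > make_names() {
    std::vector<std::unique_ptr<std::string> > names;
    names.push_back(std::unique_ptr<std::string>(new std::string("hello")));
    names.push_back(std::unique_ptr<std::string>(new std::string("world")));
    return names;
}

// Dereferencing the smart pointer yields an ordinary reference,
// so the rest of the code can work with std::string& as usual.
void shout(std::vector<std::unique_ptr<std::string> > &names) {
    for (std::size_t i = 0; i < names.size(); ++i) {
        std::string &s = *names[i];
        s += "!";
    }
}
```

The strings are freed when the vector is destroyed, and the pointer-to-reference step in `shout` is the same conversion the question asks about.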

Steven Sudit
  • `unique_ptr` is only available in C++0x, which can be prohibitive. – Billy ONeal Aug 09 '10 at 21:55
  • @Billy: "or whatever similar type is appropriate". We have `auto_ptr` right now, but Boost offers a few better alternatives. – Steven Sudit Aug 09 '10 at 21:56
  • @Steven: `auto_ptr` cannot be stored inside STL containers. – Billy ONeal Aug 09 '10 at 21:57
  • @Billy: There are many problems with `auto_ptr`, which is why I recommended alternatives while mentioning that the standard smart pointer currently available is just `auto_ptr`. – Steven Sudit Aug 09 '10 at 22:00
  • @John: [`auto_ptr` does not have the correct copy semantics to be placed in a container](http://stackoverflow.com/questions/111478/why-is-it-wrong-to-use-stdauto-ptr-with-stl-containers). – James McNellis Aug 09 '10 at 22:00
  • @James: Thanks. @John: If you don't know whether they're dynamically allocated, then you can't have the STL container own them. This may be ok, though, as you can just use raw pointers. – Steven Sudit Aug 09 '10 at 22:04
  • @Steven: of course you can. A 3rd-party library might have methods that involve containers of pointers, and doesn't tell you where the stored objects come from. – Mr. Boy Aug 09 '10 at 22:08
  • @John: You're going to have to explain what you mean. – Steven Sudit Aug 09 '10 at 22:09
  • @Steven: I'm pretty sure he's saying that the vector does not own the objects, and is not responsible for freeing them. This seems to have provoked a great deal of disbelief, but it does happen. – Steve Jessop Aug 09 '10 at 22:30
  • @John: Even if the library only gives you pointers, it's also going to give you a function that lets you pass that pointer in to be deallocated. If this is the case, then you need to make your own `unique_ptr` variant which replaces the `delete` with a call to that function. – Steven Sudit Aug 09 '10 at 23:01
  • @Steve: It does happen, indeed, but I think it can be seen as a special case that's not very different. – Steven Sudit Aug 09 '10 at 23:01
  • @Steven: You cannot implement your own `unique_ptr`, because it relies on move semantics, which were introduced in C++0x. In which case you might as well use `std::unique_ptr` in any case. – Billy ONeal Aug 09 '10 at 23:09
  • @Bill: Please take that as a "ferinstance". There are many perfectly good smart pointers available in old C++ if you use Boost. Here's a nice summary: http://www.codesynthesis.com/~boris/blog/2010/05/24/smart-pointers-in-boost-tr1-cxx-x0/ – Steven Sudit Aug 09 '10 at 23:53
  • @Steven... say `MyBigClass` contains a member of type `MyMediumClass`. Now I can in theory be passed a `vector` built from the members of every `MyBigClass` – Mr. Boy Aug 10 '10 at 07:49
  • @John: My parser can't decode that sentence. Are you asking about a vector of pointers to objects contained within larger objects? If so, there's nothing tricky about it. So long as the lifespan of the larger objects exceeds that of the vector, it'll work fine. – Steven Sudit Aug 10 '10 at 08:42
1

There's nothing wrong with it, but please be aware that at the machine-code level a reference is usually the same as a pointer. So usually the pointer isn't really dereferenced (no memory access) when it is assigned to a reference. In real life, then, the reference can be 0 and the crash occurs only when the reference is used, which can happen much later than its assignment.

Of course what happens exactly heavily depends on compiler version and hardware platform as well as compiler options and the exact usage of the reference.

Officially the behaviour of dereferencing a 0-Pointer is undefined and thus anything can happen. This anything includes that it may crash immediately, but also that it may crash much later or never.

So always make sure that you never assign a 0-Pointer to a reference - bugs like this are very hard to find.

Edit: Made the "usually" italic and added paragraph about official "undefined" behaviour.

IanH
  • Hmm, does the C++ standard *require* references to be implemented as direct pointers? – Steven Sudit Aug 09 '10 at 23:55
  • No, but most compilers usually do so. So often dereferencing a 0-Pointer and assigning it to a reference is possible in practice and may lead to strange crashes at other locations. I updated my answer to state this more clearly. – IanH Aug 10 '10 at 07:33