12

When the C++ compiler generates very similar assembler code for a reference and a pointer, why is using references preferred (and considered safer) compared to pointers?

I did see

EDIT-1:

I was looking at the assembler code generated by g++ for this small program:

int main(int argc, char* argv[])
{
  int a;
  int &ra = a;
  int *pa = &a;
}
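As a rough illustration of why the generated code tends to look so similar, here is a minimal sketch showing that the reference and the pointer from the program above designate the same object. The function name `same_object` is made up for this example, and how a compiler actually represents `ra` is up to the implementation.

```cpp
// Hypothetical helper: returns true when the reference and the pointer
// designate the same object.
bool same_object()
{
    int a = 0;
    int &ra = a;   // ra is another name for a
    int *pa = &a;  // pa stores the address of a

    ra = 1;        // writes to a through the reference
    *pa += 1;      // writes to a through the pointer

    // Taking the address of a reference yields the address of its referent.
    return &ra == pa && a == 2;
}
```

Calling `same_object()` should return `true`: both names reach the same `int`.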
Community
yasouser
  • @YouKnowWho: I'd take this with a grain of salt --> a compiler makes few errors, but they generally have very interesting effects :) – Matthieu M. Jan 17 '11 at 17:09
  • 2
    You didn't provide a citation for "using references [is] preferred (and considered safer) compared to pointers". I'd wager that many people on this forum might disagree with that assertion. – Lightness Races in Orbit Jan 17 '11 at 17:11
  • 1
    @Tomalak: I don't know of any concrete reference. And that's what I am trying to get cleared through this question. – yasouser Jan 17 '11 at 18:15

9 Answers

22

It's considered safer because a lot of people have "heard" that it's safer and then told others, who now have also "heard" that it's safer.

Not a single person who understands references will tell you that they're any safer than pointers; they have the same flaws and the same potential to become invalid.

e.g.

#include <vector>

int main(void)
{
    std::vector<int> v;
    v.resize(1);

    int& r = v[0];
    r = 5; // ok, reference is valid

    v.resize(1000);
    r = 6; // BOOM! resize() may have reallocated, leaving r dangling (undefined behaviour)

    return 0;
}
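One way to sidestep the problem, sketched under the assumption that only the element's position matters: keep an index and look the element up again after anything that may reallocate. Note that `int* p = &v[0];` would dangle in exactly the same way as the reference does, which is the point being made. The name `write_across_resize` is invented for this sketch.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: writes through an index, before and after a resize.
int write_across_resize()
{
    std::vector<int> v(1);

    std::size_t i = 0;  // an index stays meaningful across reallocation
    v[i] = 5;

    v.resize(1000);     // may move the elements to new storage
    v[i] = 6;           // fine: v[i] is looked up again after the resize

    return v[0];
}
```

The function returns 6, and no reference or pointer into the old storage is ever used after the resize.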

EDIT: Since there seems to be some confusion about whether a reference is an alias for an object or bound to a memory location, here's the paragraph from the standard (draft 3225, section [basic.life]) which clearly states that a reference binds to storage and can outlive the object which existed when the reference was created:

If, after the lifetime of an object has ended and before the storage which the object occupied is reused or released, a new object is created at the storage location which the original object occupied, a pointer that pointed to the original object, a reference that referred to the original object, or the name of the original object will automatically refer to the new object and, once the lifetime of the new object has started, can be used to manipulate the new object, if:

  • the storage for the new object exactly overlays the storage location which the original object occupied, and
  • the new object is of the same type as the original object (ignoring the top-level cv-qualifiers), and
  • the type of the original object is not const-qualified, and, if a class type, does not contain any non-static data member whose type is const-qualified or a reference type, and
  • the original object was a most derived object of type T and the new object is a most derived object of type T (that is, they are not base class subobjects).
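To make the quoted rule concrete, here is a small sketch that ends one object's lifetime and creates a new one of the same type in the same storage. `Widget` and `reuse_storage` are hypothetical names, and the example assumes all four conditions listed above are met (same storage, same type, no const or reference members, most derived object).

```cpp
#include <new>

struct Widget { int value; };  // hypothetical type for illustration

int reuse_storage()
{
    alignas(Widget) unsigned char storage[sizeof(Widget)];

    Widget* p = new (storage) Widget{1};  // first object
    Widget& r = *p;                       // reference to the first object

    p->~Widget();                         // lifetime of the first object ends
    new (storage) Widget{2};              // new object, same type, same storage

    return r.value;  // r now refers to the new object
}
```

Under the quoted paragraph, `r` refers to the second `Widget`, so the function returns 2 even though `r` was bound while the first object was alive.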
Ben Voigt
  • 3
    +1 simply for this: "It's considered safer because a lot of people have "heard" that it's safer and then told others, who now have also "heard" that it's safer." – yasouser Jan 17 '11 at 18:10
  • 7
    "Not a single person who understands references will tell you that they're any safer than pointers" ---> I understand them (as I'm sure others do too) and I still say they're safer. Note that "safer" does *not* mean "safe"... you can still shoot yourself in the foot, it's just harder. – user541686 Jan 17 '11 at 20:12
  • "a reference that referred to the original object, or the name of the original object will automatically refer to the new object." This quote supports what I have been saying all along, that a reference to an object is no more a "address" than the object itself, i.e., what the quote calls "the name." References are addresses the same way that names are addresses. Both are symbols for the address of a data object in memory. But "the name" of a pointer represents the address of an address, not the address of any data object. – ThomasMcLeod Jan 18 '11 at 04:54
  • 1
    @ThomasMcLeod: Perhaps you missed that pointers fall into exactly the same category -- all three, a pointer, a reference, and the variable name, *all address the **storage location**, not the object*. Don't try to equate "the name" with "the object itself", when the standard clearly describes different lifetimes. – Ben Voigt Jan 18 '11 at 05:01
  • 1
    @masters3d: Too bad he makes about as many mistakes while explaining it as there are in the "fact" he debunks. I got as far as calling `if (ptr)` "fault-tolerance", which is just absurd. But yes, other people have noticed that references aren't made safe by the compiler, they have to be made safe by correct code -- exactly like pointers. – Ben Voigt Jun 05 '16 at 00:57
8

It depends how you define "safer".

The compiler won't let you create an uninitialised reference, or one that points to NULLness, and it won't let you accidentally make your reference refer someplace else whilst you're using it. These stricter rules also mean that your compiler can give you some more warnings about common mistakes, whereas with pointers it's never really sure whether you meant to do what you did, or not.
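A minimal sketch of those two rules, with a made-up function name (`reference_rules`): a reference must be initialised, and assigning to it afterwards writes to the referent instead of rebinding.

```cpp
// Hypothetical function for illustration.
int reference_rules()
{
    int a = 1;
    int b = 2;

    // int& bad;   // does not compile: a reference must be initialised
    int& r = a;    // r is bound to a for its whole lifetime

    r = b;         // not a rebind: copies the value of b into a
    b = 10;        // so later changes to b do not show through r

    return a + r;  // a and r are both 2 here
}
```

The function returns 4, because `r = b` changed `a` to 2 and left `r` bound to `a`.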

On the other hand, the transparency of the syntax -- namely, what Alexandre C. mentioned about what it looks like at the call-site to pass an argument as a reference -- makes it quite easy not to realise that you're passing a reference. Consequently, you might not realise that you're supposed to maintain ownership and lifetime of the argument, or that your argument may get permanently modified.
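The call-site difference can be sketched like this; `bump_ref`, `bump_ptr` and `call_sites` are names invented for the example.

```cpp
// Hypothetical functions: both add one to the caller's variable.
void bump_ref(int& n) { ++n; }
void bump_ptr(int* n) { ++*n; }

int call_sites()
{
    int x = 0;
    bump_ref(x);   // reads like pass-by-value, yet x is modified
    bump_ptr(&x);  // the & hints that x may be modified
    return x;
}
```

Both calls modify `x` (the function returns 2), but only the pointer version says so at the call site.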

Lightness Races in Orbit
  • 1
    Of course, you can declare a const pointer (I don't mean pointer-to-const), which will require you to initialize it and prevent making it refer somewhere else. – Ben Voigt Jan 17 '11 at 17:13
  • 3
    I'm just saying it's not an exclusive benefit of using a reference, you can get the same behavior from a pointer (by declaring it const). – Ben Voigt Jan 17 '11 at 17:18
  • Regarding uninitialised references, `int *const p` also has that feature. Regarding nullness, `[[gnu::nonnull]] int *p` also has that feature. BTW, the compiler doesn't complain if you assign a (possibly-NULL) pointer to a nonnull pointer nor if you assign it to a reference. So in the end, a reference is as unsafe (and uglier than, IMO) as `[[gnu::nonnull]] int *const p`. – alx - recommends codidact Nov 16 '21 at 15:58
  • Re: *"whereas with pointers it's never really sure whether you meant to do what you did, or not"*: except if you use Clang's `_Nonnull`, in which case you can get even more warnings with pointers than with references. References are a disguised `[[gnu::nonnull]] *const` pointer. – alx - recommends codidact Nov 16 '21 at 18:32
7

Because references (which are simply an alias for another variable) can't be NULL by definition, providing an inherent layer of safety.
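A rough sketch of what that means for a function's author, using hypothetical names (`reset_ref`, `reset_ptr`). It only addresses null; a reference can still dangle if the object it was bound to has gone away.

```cpp
// Hypothetical functions for illustration.
void reset_ref(int& i)
{
    i = 0;  // no null state to check for
}

bool reset_ptr(int* i)
{
    // The pointer version has to decide what a null argument means.
    if (i == nullptr) return false;
    *i = 0;
    return true;
}
```

With the reference version, a null argument can only come from a caller that has already dereferenced a null pointer; with the pointer version, null is a legal value that the function itself has to handle.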

user541686
  • 9
    Forbidding NULLs doesn't make references safe. In fact, references are not any safer than pointers, and the fact that the indirection is hidden makes them more dangerous. – Ben Voigt Jan 17 '11 at 17:06
  • they can't be **created** as null, but they are not safer than pointers. – Alexandre C. Jan 17 '11 at 17:07
  • 2
    @Alexandre: They can be created as null (dereferencing a pointer), but the most important point is that they can refer to garbage too. – Matthieu M. Jan 17 '11 at 17:10
  • 5
    @Alexandre C.: They cannot be NULL unless you have previously invoked UB, at which point all bets are off, therefore it's not the fault of the reference. :) – Lightness Races in Orbit Jan 17 '11 at 17:11
  • @Matthieu, @Tomalak: You can indeed write `int& x = *((int*)0)`, but it is quite difficult to write it unintentionally. – Alexandre C. Jan 17 '11 at 17:11
  • 2
    @Matthieu - no, they can't. Dereferencing a NULL pointer is UB. A program that does this is ill-formed already. Thus the lesson to learn there is NOT to dereference invalid or null pointers. – Edward Strange Jan 17 '11 at 17:13
  • @Noah: I thought it was the fact of using the dereferenced pointer that was UB, not merely dereferencing it. Sorry if I recalled wrong. – Matthieu M. Jan 17 '11 at 17:22
  • 1
    @Noah: A program that passes NULL to a function documented to require a valid pointer is also ill-formed. There's absolutely no difference. – Ben Voigt Jan 17 '11 at 17:23
  • @Ben Voigt - No it isn't. "ill-formed" is a technical term here, not one of opinion. If that function CHECKS before using the pointer and then throws or something...program is fine. – Edward Strange Jan 17 '11 at 17:34
  • @Noah: Dereferencing a null pointer doesn't make a program ill-formed. "Ill-formed" is a status of the source code, while "behavior" is a status of the running environment. @Matt: It was [proposed](http://stackoverflow.com/questions/2474018/when-does-invoking-a-member-function-on-a-null-instance-result-in-undefined-behav), never happened. – GManNickG Jan 17 '11 at 20:05
  • 4
    @Ben: No, there is a difference. This *function* (note the unit) is guaranteed to work: `void f(int& i) { i = 0; }` each and every time, no matter what. If I have a bug, I never need to consider this function the cause. Contrarily, this function *can* break: `void f(int* i) { *i = 0; }`, and passing a null pointer to the function *is* something I'd need to consider in the face of a bug. What references do is make my functions self-contained, and put the onus on the *user* of the function to handle any pointers he may have correctly. – GManNickG Jan 17 '11 at 20:07
  • 1
    @GMan: What difference? Both the function and the reference case work properly if a valid argument is passed in, and fail if not. Neither one contains a bug. In both cases, the onus is on the caller to meet the contract. – Ben Voigt Jan 17 '11 at 22:02
  • 1
    @Ben: So you're saying there's no difference because they both work with valid arguments, and both fail with invalid arguments? Ok, they both work properly given a valid argument, but you're hand-waving over the statement "and fail if not". The reference version *doesn't contain that possibility*. There is a difference, and it's that the set of possible invalid inputs for the reference version is empty, while the pointer version has one. – GManNickG Jan 17 '11 at 22:05
  • 1
    @GMan: If a wild pointer gets passed in, you get UB. If a dangling reference gets passed in, you get UB. This is the contract: "When calling `f`, you must provide a valid buffer of type `int`". Violating that contract has the same effect for both versions. If you want to argue the point that it's possible for a program whose behavior is otherwise well-defined to become UB by the inclusion of the pointer version of `f`, I agree. Below is a function that can be included into a program that is otherwise well-defined behavior and make it UB, while the pointer equivalent never introduces UB. – Ben Voigt Jan 17 '11 at 22:15
  • 1
    Should have been `const int& f(const int& r) { return r; }`. I hit the limit on comment length. – Ben Voigt Jan 17 '11 at 22:17
  • 1
    @Ben: This contract argument is extremely arbitrary to me. Why not make the contract "don't do anything wrong"? The point isn't that you can specify rules humans should follow, it's that the language provides a mechanism to *ensure* it. I'm absolutely certain the reference version of `f` is well-defined, but that isn't the case for the pointer version. Consider for the moment that we sub-divide a program as follows: statements (the atomic unit), functions (collections of statements), and programs (collection of functions). If any statement *may* fail upon execution... (1/?) – GManNickG Jan 17 '11 at 22:27
  • ...then we call that statement unsafe. A function that contains an unsafe statement is an unsafe function, and a program that contains an unsafe function is an unsafe program. The reference version of `f` is *safe*, while the pointer version is *unsafe*. That means any functions that use the pointer version of `f` are unsafe, while functions that use the reference version may remain safe. Ultimately, using the pointer version, you make the program unsafe by having this statement that can potentially cause UB, while you cannot with the reference version. If *every* function is safe... (2/?) – GManNickG Jan 17 '11 at 22:31
  • ...in a program consisting of a statement and then a call to those functions, then a bug can only be introduced by that statement. Contrarily, unsafe functions mean anything that use it is suspect. So when I say the onus is on the caller, I mean that if we compare `int main() { volatile int *i = 0; f(i); }` (obviously using the pointer version), with this one `int main() { volatile int *i = 0; f(*i); }`, we see that because `f` in the first program has an unsafe statement, that it *allowed* UB, while in the second one, it did not allow UB: that occurred in the calling code. (3/3) – GManNickG Jan 17 '11 at 22:35
  • 1
    @GMan: Ok, and then is it the function or the calling code which is *unsafe* for `const int& f(const int& r) { return r; } int main(void) { int x = 0; const int& r2 = f(x); const int& r3 = f(x+1); std::cout << r2 << r3; }`? Now the reference version conditionally triggers UB (while being perfectly well-formed), and the obvious pointer equivalent only has problems if UB occurred in the calling code, it never introduces UB. I stand by my claim that references are not safer than pointers, both have stronger preconditions on use than simply "properly constructed". – Ben Voigt Jan 17 '11 at 23:25
  • @Ben: I get your point, but does that trigger UB? `x + 1` is a temporary, yes, and a reference to it is returned by the function, but does the binding to `r3` extend it? – GManNickG Jan 17 '11 at 23:29
  • @Ben: `const int& r3 = f(x+1);`? Really? You're passing an r-value as an l-value? That's bad practice in *any* language, even if it's allowed (hence why languages like C# disallow it). If you don't engage in writing code that you **know** will be problematic even without context (i.e. if you never pass r-values as references, never use uninitialized variables, etc.), references will be safer than pointers. Of course, if you choose to shoot yourself in the foot on purpose, then obviously nothing will stand in your way... not even using Java, which doesn't contain pointers. – user541686 Jan 17 '11 at 23:29
  • @Lambert, check the code, it's an r-value reference. Maybe I should have used a different function name than GMan did, since it's a separate example. – Ben Voigt Jan 17 '11 at 23:37
  • @GMan: No, its lifetime isn't extended. That's the problem. The `main` function is perfectly valid with `const int& f(const int& r) { static int a; a = r; return a; }`, and likewise the `f` function is perfectly valid when used the first time in `main`. My function with references appears to be exactly as safe as your function which dereferences a pointer parameter. – Ben Voigt Jan 17 '11 at 23:42
  • @Lambert: But whatever your opinion of `const int& r3 = f(x+1);`, you can't argue with the fact that it's an example of references being less safe than pointers. – Ben Voigt Jan 17 '11 at 23:45
  • @Ben: ... the point isn't to try to think of examples that cause problems. We know that *any sort of protection in pretty much any language* can be bypassed: permissions can be bypassed by reflection, Java can suddenly become unsafe with the use of the Java Native Interface, casts can remove `const`ness and can create null references, etc. The point is *how easy* is this for a reasonably experienced programmer to do. (1/?) – user541686 Jan 17 '11 at 23:45
  • ... Experience (at least mine -- I don't know about yours) says that you shouldn't **EVER** pass an r-value to a function that expects an l-value, even if the compiler generates a temporary variable for you. Some languages forbid this entirely (e.g. C#), and some don't (e.g. Visual Basic), and for good reasons like this. Of course, if you pretend like you don't know anything, then it's not that hard to come up with silly examples. But a reasonable programmer would find it harder to shoot himself in the foot with references **by accident** (*not* on purpose). (2/2... hopefully.) – user541686 Jan 17 '11 at 23:49
  • @Ben: If I show you a serial killer with a knife and I also show you a policeman with a shotgun, is it the cop that's more dangerous to you or the bad guy? Your argument is just like saying that you'd rather be around the serial killer because the cop is much more dangerous, since he can one-hit KO five people at once, while the bad guy has to do a lot more work for the same task. And you may "prove" this by showing me a cop that killed five people when chasing a killer. That may be true, but in practice, I doubt that it's the actual way you'd reason this in your head... right? – user541686 Jan 17 '11 at 23:54
  • @Ben: Okay, just checking. So then yes, this function is dangerous, but I don't think that means we can say references and pointers are equally likely to be unsafe. (I will say according to my definitions/arguments, you're right and I failed to distinguish that.) I'd like to say, though, that your function also doesn't do anything. In other words, in practice with real programs, you'll find a lot of uses for references that, if converted to pointers, would make the program unsafe, unless additional checks were added. – GManNickG Jan 17 '11 at 23:56
  • @Ben: Perhaps another route, using your contract argument: references are a *language-level contract* to use a valid object, while a pointer has no such implicit contract. Looking at our original two `f`'s, nothing about the pointer version lets me know I cannot pass `nullptr`. – GManNickG Jan 17 '11 at 23:57
  • @GMan: I assume you mean looking at only the signature of the pointer version. But I don't even have to look at the signature of the pointer version to know that the buffer I pass might be modified, I can tell that from the call site. And nothing about the signature of my counter-example says that passing in a temporary is dangerous. Both have their place, there are instances where it is more appropriate to use one or the other. And "use references whenever/wherever possible" is definitely not the right rule. – Ben Voigt Jan 18 '11 at 01:16
  • @Lambert: To an experienced programmer, pointers cry out "Here be dragons. Defend thyself against lifetime management issues!", while references suggest that everything is well taken care of even though it's not. References are ideal for operator overloading and avoiding needless copying of input arguments. I don't see any valid reason to use non-const reference parameters, or return a reference except from an operator definition. If a function requires a parameter to be an l-value, it should accept a pointer. The convention in C++ is that const references do accept r-values. – Ben Voigt Jan 18 '11 at 01:21
  • @Ben: "Nothing about the signature of my counter-example says that passing in a temporary is dangerous." --> How I look at it is, *there's a reason* the callee wanted the variable passed by reference. So if you try to fool the callee by making a variable that's basically passed by value, you should *expect* that something will go wrong... even without knowing the body of the code. (Here's a question for you: why do you think C# chose not to allow passing `ref` parameters this way?) – user541686 Jan 18 '11 at 01:27
  • @Ben: Another question: How often do you, personally, ACTUALLY run into trouble with invalid references? And how often do you do the same with pointers? (Edit: Since I like comparing things, here's another comparison: which is more dangerous, a drunk driver or a sober one? If a sober driver can kill passengers just as easily as a drunk one can, then why is DUI bad? You're applying the same comparison to pointers/references...) – user541686 Jan 18 '11 at 01:30
  • @Lambert: Because I understand that references don't solve all the world's problems, I use them quite sparingly. And have a correspondingly low number of problems with them. And with pointers I don't remember the last time I had a pointer problem that using a reference would have avoided, maybe 5 or 6 years ago? As for why C# doesn't allow in reference arguments, I suspect because it was judged (rightly) as a lot of work for very little benefit. – Ben Voigt Jan 18 '11 at 01:39
  • @Lambert: Please note that C# has nothing equivalent to a C++ reference -- C# references to objects can be rebound and can be null. C# reference parameters can't, but like pointers they are not transparent. – Ben Voigt Jan 18 '11 at 01:41
  • @Lambert: I think a better analogy would be a drunk driver blaring a bullhorn vs a sober invisible driver. When I see a piece of code someone else wrote, I know instantly if there are pointers around. I have to investigate closely to see if there are references. – Ben Voigt Jan 18 '11 at 01:44
  • @Ben: My only point with C# was that they chose not to support temporary ref parameters and returned ref values precisely because they're just asking for trouble (that is, until version 4, when they suddenly decided to add support for the former for some reason I can't figure out). If you avoid troublesome behavior like this (which I've learned to do from experience), references suddenly make things easier. But if you purposefully engage in dangerous behavior then obviously nothing will make things safe for you. – user541686 Jan 18 '11 at 01:48
  • @Ben: What's so different about pointers and references that causes them to be so different in visibility (and hence your invisible driver)? They're both documented in method signatures, aren't they? So if a method returns a reference, isn't it fairly obvious that the first thing you should do is check the lifetime of what it points to? How's that different from a pointer? – user541686 Jan 18 '11 at 01:50
  • @Lambert: There's no implicit conversion from variable to pointer, that's the visibility difference. Checking the signature is not required. You seem to think I have a strange approach to using references, but yours is out of touch to at least the same extent, C++ const references are designed to bind r-values and that's probably the most common use of references, yet you insist that a function with a reference parameter requires an l-value. I solve that by using a pointer where an l-value is required, it not only solves the problem but the language helps enforce it. – Ben Voigt Jan 18 '11 at 02:30
  • I thought the most common use of a reference was so you can modify the original value (or, at the very least, to avoid copying a large structure)? If you pass an r-value aren't you kind of defeating the whole point of a reference? – user541686 Jan 18 '11 at 02:41
  • @Lambert: The use where a reference is most necessary is so you can say `obj1 = obj2;` instead of `obj1 = &obj2;`. Next most common: to avoid copying a large structure. But remember that pass-by-value allows either an l-value or a temporary to be supplied. The optimization of avoiding a copy would be nearly useless if it lost that flexibility. Returning a reference is handy to allow modification of an element of a larger structure in-place (e.g. `operator[]`), and callers shouldn't ever assume they can store that returned reference long-term. Any long-term alias should be a pointer. – Ben Voigt Jan 18 '11 at 04:06
  • @Ben: What I don't understand is, why not just say `int x = 5; f(x);` instead of `f(5)`? My entire point is that, instead of temporarily making the compiler allocate the x for you, you should always do it yourself -- and if you do, your code doesn't break. You said that passing r-values as references is necessary to prevent extra copying in this, but I'm failing to see how... could you elaborate on that? – user541686 Jan 18 '11 at 04:08
  • [Here's a question](http://stackoverflow.com/questions/4633767/how-to-return-a-reference-in-c) where the experts had to hammer home the point about returning a pointer when ownership is being handed over. – Ben Voigt Jan 18 '11 at 04:08
  • @Lambert: I didn't say that passing r-values as references was necessary to avoid the copy. I said that if the no-copy optimization conflicted with the ability to pass an r-value, programmers would revolt and no one would use the no-copy optimization. After all, `int x = 5; f(&x);` works just as well. The big benefit of accepting an input-only parameter by const reference (instead of pointer-to-const) is that it's nearly 100% compatible with pass-by-value usage. – Ben Voigt Jan 18 '11 at 04:12
  • The funny thing is, I *agree* with you that references shouldn't be returned. I said that many times: returning references, and passing r-values as references, are generally bad practice. As long as you don't engage in those (with, I should add, the major exception of when you're overloading operators, since in that case you don't have many other options, and it's usually clear in those cases what the object lifetimes are), references are much safer than pointers. – user541686 Jan 18 '11 at 04:13
  • @Ben: "The big benefit of accepting an input-only parameter by const reference is that it's nearly 100% compatible with pass-by-value usage." --> Assuming you're talking about the fact that it *looks* just like pass-by-value (if not, then correct me), how is that a benefit of any sort? All it does it open up huge potential for problems like in your example, for no good reason. – user541686 Jan 18 '11 at 04:16
  • @Lambert: It means that you can retrofit a function from pass-by-value to pass-by-const-reference without having to modify every call site. As for my example, it violates my advice that returned references are to be used immediately, never stored (that advice prevents the ugly problems with the example in my answer as well). – Ben Voigt Jan 18 '11 at 04:25
  • @Lambert: And of course I meant that the benefit is that it's compatible AND avoids copying a large object. As opposed to pass-by-pointer-to-const which also avoids the copy, but isn't compatible syntax and therefore not conducive to retrofitting existing functions. – Ben Voigt Jan 18 '11 at 04:27
  • @Ben: I understand that it avoids copying the large object, but doesn't `int x = 5; f(&x);` also avoid copying `x`? What's the difference? (Also, the issue with the call-site incompatibility is a good thing: If you change your design so much so as to change byref into byval, I really think the compiler had better *force* you to take another look at *every* call... I actually *don't* like the cross-compatibility.) – user541686 Jan 18 '11 at 04:36
  • 1
    @Lambert: I can't stand having the same syntax for mutable reference parameters as copied parameters, which is why I avoid mutable reference parameters like the plague. But read-only input parameters are another matter entirely, there's rarely any reason not to just read them out of the caller's copy. Changing pass-by-value into pass-by-const-reference isn't a design change, it's a minor optimization. OTOH changing an input parameter into an in/out parameter is a significant change (refer back to first sentence). – Ben Voigt Jan 18 '11 at 05:06
  • @Lambert: Also I'm assuming that the function doesn't store the address of its pass-by-const-reference parameter (in any location that outlives the call). That also would be a significant design change and call for using a pointer instead, with the corresponding change to call syntax. – Ben Voigt Jan 18 '11 at 05:08
  • @Ben: Ah, but there's a subtle problem with what you're saying. Read-only input parameters are *only* really an optimization if your struct is large. And if your struct is large, you're almost never going to create it right inside the argument list -- it's virtually always already stored in a variable, and you have to pass the variable name. So your example of passing a new struct (or a literal) as a const reference really doesn't ever come up in practice. Can you think of a *practical* example of when it would be done, giving a good reason why it's a const reference and not pass-by-value? – user541686 Jan 18 '11 at 05:13
  • @Ben: In other words (in case that didn't make sense), what I mean is, in any practical scenario, you wouldn't be passing a const reference as an r-value, because why would you? If it's for optimization, then passing a primitive type isn't really an optimization; if it's for optimizing large structs, why would you ever make a *temporary* struct in the first place? It's already going to be given to you in a variable (and if not, creating it would be expensive, so you'd be caching it globally and passing a global reference, not a local reference to a temp object). Your example isn't realistic... – user541686 Jan 18 '11 at 05:18
  • @Lambert: I think `std::string` might qualify. It's often large enough to want to save the copy, yet also often created as a temporary. True that you probably don't want to create a temporary large enough to worry about copying, but there are significant advantages to having one function be able to accommodate both, especially when it's wrapped in a reusable library. How many levels deep are you going to create duplicate functions, one which passes `std::string` by value and accepts temporaries, and one which passes `std::string` by reference to avoid the copy? Better to have a unified fn. – Ben Voigt Jan 18 '11 at 07:27
  • @Ben: My point was, the situation occurs infrequently enough that it's not much trouble for the programmer to make his own temporary variable by hand every once in a while. I didn't say you should make two functions, but that the situation isn't often enough to justify trading safety for ease in passing parameters. – user541686 Jan 18 '11 at 15:04
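The temporary-lifetime case debated in the comments above can be sketched as follows. The function names are hypothetical, and the dangling line is left commented out so that the sketch itself stays well-defined.

```cpp
// Hypothetical functions for illustration.
const int& pass_through(const int& r) { return r; }  // returns its argument
int copy_of(const int& r) { return r; }              // returns a copy instead

int temporaries()
{
    int x = 0;

    const int& ok = pass_through(x);  // fine: x outlives ok
    // const int& bad = pass_through(x + 1);
    //   would dangle: the temporary x + 1 is destroyed at the end of
    //   that statement, and binding to `bad` does not extend its lifetime.

    int safe = copy_of(x + 1);        // fine: the value is copied out in time

    return ok + safe;
}
```

Nothing in the signature of `pass_through` distinguishes the safe call from the dangling one, which is the hazard being argued about.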
2

A reference is always initialized from an existing object, thus it can never be NULL, whereas a pointer variable is allowed to be NULL.

EDIT: Thanks for all of the replies. Yes, a reference can indeed point to garbage, I forgot about dangling references.

Chris O
  • 2
    A reference can be initialized from a pointer, so it really isn't any safer. – Ben Voigt Jan 17 '11 at 17:05
  • 1
    This isn't necessarily true. A reference can be initialized from a dereferenced pointer, in which case if the pointer is null an exception will be thrown. – Beanz Jan 17 '11 at 17:06
  • @Chris: no it will not. It is **unexpected behaviour**. – Alexandre C. Jan 17 '11 at 17:07
  • @Ben Voigt: It is. Let's say we have two knives: one with and one without a plastic cover. We can cut ourselves with both knives, but with the covered one we first have to remove the plastic cover to cut ourselves. So the covered one is safer. – orlp Jan 17 '11 at 17:09
  • @nightcracker: Except that it's trivially easy to get an invalid reference, even without using pointers. – Ben Voigt Jan 17 '11 at 17:10
  • 1
    @Alexandre C.: No, it is not. It is **undefined behaviour**. – Lightness Races in Orbit Jan 17 '11 at 17:12
  • @Chris: wtf? An exception? Are you mad?! You may be confusing exceptions with OS-specific access violations, which some OSs might call "exceptions". – Lightness Races in Orbit Jan 17 '11 at 17:12
  • 1
    @nightcracker: Only by letting an object fall out of scope! Yes, it's easy to be left with a dangling reference. – Lightness Races in Orbit Jan 17 '11 at 17:13
  • @Alexandre you are correct. I've been working in windows too long where null pointer dereferencing is an exception. According to the C++ standard it results in undefined behavior, usually an OS fault. – Beanz Jan 17 '11 at 17:15
  • 2
    Practically speaking (i.e. on all major compilers) binding a reference by dereferencing a NULL pointer won't cause an access violation, segmentation fault, null pointer exception, or any behavior different from copying a NULL pointer to another pointer. Bad things won't happen until access is made through the reference. A reference is basically an automatically dereferenced pointer, at least in real life. And this is perfectly compliant to the standard... UB means the program can fail immediately, work, or fail later in some obscure unexpected way. – Ben Voigt Jan 17 '11 at 17:22
2

It is a little safer, but not the same thing. Note that you have the same problems with "dangling references" as with "dangling pointers". For instance, returning a reference to a local (automatic) object yields undefined behaviour as soon as the caller uses it, exactly as with pointers:

int& f() { int x = 2; return x; }

The only benefit is that you cannot create a null reference. Even if you try hard:

int& null_ref = *((int*)0); // Dereferencing a null pointer is undefined in C++
                            // The variable null_ref has an undefined state.

As class members, pointers are often preferred since they have better assignment semantics: you cannot rebind a reference once it has been initialized. The compiler cannot generate a default copy assignment operator for a class that has reference members.
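To make the point about reference members concrete, here is a minimal sketch. The types `HoldsRef` and `HoldsPtr` and the function `assign_through_reference` are made up for illustration; the only library facility used is `std::is_copy_assignable` from `<type_traits>`.

```cpp
#include <type_traits>

// Hypothetical types, for illustration only.
struct HoldsRef {
    int& target;   // bound once in the constructor, can never be rebound
    explicit HoldsRef(int& t) : target(t) {}
};

struct HoldsPtr {
    int* target;   // can be repointed, so memberwise assignment works
    explicit HoldsPtr(int* t) : target(t) {}
};

// The implicitly declared copy assignment operator is deleted for a class
// with a reference member, but not for one with a pointer member.
static_assert(!std::is_copy_assignable<HoldsRef>::value,
              "a reference member blocks the default operator=");
static_assert(std::is_copy_assignable<HoldsPtr>::value,
              "a pointer member keeps the default operator=");

// Assigning through a reference writes to the referent; it does not rebind.
int assign_through_reference() {
    int a = 1;
    int b = 2;
    int& r = a;
    r = b;      // copies b's value into a; r still refers to a
    b = 3;      // does not affect a
    return a;
}
```

If you really need a reference-like member in an assignable class, you would have to write `operator=` yourself or switch to a pointer (or something like `std::reference_wrapper`).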

Therefore, C++ cannot get rid of pointers, and you can use them freely: by passing arguments as pointers rather than as (non-const) references, you make it clear at the call site that the object may be modified. This can add a little safety, since you can see with the naked eye which functions actually modify objects.
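A small sketch of the call-site argument, with made-up function names (`reset_by_pointer`, `reset_by_reference`, `call_site_demo`). Both helpers modify their argument, but only the pointer version shows it where the call is written:

```cpp
// Hypothetical helpers, for illustration only.
void reset_by_pointer(int* value) { *value = 0; }
void reset_by_reference(int& value) { value = 0; }

int call_site_demo() {
    int a = 5;
    int b = 7;
    reset_by_pointer(&a);    // the & at the call site hints that a may change
    reset_by_reference(b);   // reads exactly like a pass-by-value call
    return a + b;            // both were reset
}
```

This is a style convention rather than a guarantee: a function taking a pointer may still leave the object alone, and the pointer version has to decide what to do with a null argument.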

I am playing devil's advocate a little, but references are easy to abuse.

Alexandre C.
  • 55,948
  • 11
  • 128
  • 197
  • I don't see how the story in the first paragraph makes them less "safe". Just more irritating. – Lightness Races in Orbit Jan 17 '11 at 17:07
  • @Tomalak: yes, it is a little off topic, but I must confess I read the question quickly. – Alexandre C. Jan 17 '11 at 17:09
  • "Dereferencing a null pointer is undefined in C++ so the program is ill formed." That's wrong, and doesn't make any sense. That line is perfectly well-formed (a measurement made of the source code at *compile-time*), but its result is undefined (a measurement made of the running environment at *run-time*). A run-time measurement has no bearing on the compile-time measurement, unless you're proposing taking "anything can happen in UB" a bit too far (time machine?). – GManNickG Jan 17 '11 at 20:10
  • @GMan: you're right, but the contents of the `null_ref` are undefined. – Alexandre C. Jan 17 '11 at 20:57
  • @Alexandre: You could say that, but undefined behavior is the end-all result. `null_ref`'s value is unknown, sure, but the state of the entire program is unknown. – GManNickG Jan 17 '11 at 21:04
2

Because references must always be initialized and since they must refer to an existing object, it is much harder (but by no means impossible) to end up with dangling references than it is to have uninitialized/dangling pointers. Also, it's easier to manipulate references because you don't have to worry about taking addresses and dereferencing them.

But just to show you that a reference by itself doesn't make your program 100% safe, consider this:

int *p = NULL;
int &r = *p;
r = 10; /* bad things will happen here */

Or this:

int &foo() {
  int i;
  return i;
}

...

int &r = foo();
r = 10; /* again, bad things will happen here */
casablanca
  • 69,683
  • 7
  • 133
  • 150
0

Well, the answer you point to already answers that. From the "safer" point of view, I think that with references it is basically hard to write code like:

int* i;
// ...
cout << *i << endl; // undefined behaviour: i is uninitialized (often a segfault)

because a reference is always initialized, or code like:

MyObject* po = new MyObject(foo);
// ...
delete po;
// ...
po->doSomething(); // undefined behaviour: use after delete (often a segfault)

But as said in the question you mention, safety is not the only reason references are used ...

my2c

neuro
  • 14,948
  • 3
  • 36
  • 59
0

Suppose you didn't have the reference operator <type>& in the language. Then whenever you wanted to pass a reference to an object to a function you would have to do what's done in C, pass the argument as &<myobject> and receive it in the function as a pointer parameter <type>* p. Then you would have to discipline yourself not to do something like p++.

That's where the safety comes in. Passing by reference is very useful, and having the reference operator avoids the risk that you'll modify the pointer.

(Maybe a const pointer would accomplish the same thing, but you've got to admit the & is cleaner. And it can be applied to other variables besides parameters.)
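Roughly, the three options look like this. The function names are invented for the example; the commented-out lines show what each signature would or would not let you do by accident.

```cpp
// Hypothetical functions, for illustration only.
void bump_via_pointer(int* p) {
    // ++p;   // would compile, and p would then point past the caller's object
    ++*p;     // what was actually intended: increment the pointed-to int
}

void bump_via_const_pointer(int* const p) {
    // ++p;   // error: p itself is const
    ++*p;
}

void bump_via_reference(int& r) {
    ++r;      // increments the int; no syntax exists to make r refer elsewhere
}

int pointer_vs_reference_demo() {
    int x = 0;
    bump_via_pointer(&x);
    bump_via_const_pointer(&x);
    bump_via_reference(x);
    return x;   // incremented three times
}
```

The const pointer gives the same protection against moving the pointer, but the caller can still pass a null pointer, and the body needs `*` everywhere.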

Mike Dunlavey
  • 40,059
  • 14
  • 91
  • 135
-1

A pointer is an independent variable that can be reassigned to point to another data item, uninitialized memory, or just nowhere at all (NULL). A pointer can be incremented, decremented, subtracted from another pointer of the same type, etc. A reference is tied to an existing variable and is simply an alias for the variable name.

ThomasMcLeod
  • 7,603
  • 4
  • 42
  • 80
  • A reference can point into memory just like a pointer, you can create a reference out of a pointer, so anything that would invalidate the pointer also invalidates the reference. – Ben Voigt Jan 17 '11 at 17:12
  • 2
    @ThomasMcLeod: That a reference is an alias is a nice dream, but also a sad and common misconception. – Lightness Races in Orbit Jan 17 '11 at 17:15
  • @Ben: A reference is not a pointer and does not point into memory any more than any variable name points into memory. When I declare int i; does "i" "point into memory"? Well, i as a symbol represents a memory address, but it's still not a pointer. Same with a reference. Furthermore, you cannot initialize a reference to T with a pointer to T, unless you dereference the pointer, at which point you are not initializing with a T *, but simply a T. – ThomasMcLeod Jan 17 '11 at 17:45
  • @ThomasMcLeod: Of course a reference points into memory. How else would it be possible to have references as class members and then use them in an entirely different part of the program? – Ben Voigt Jan 17 '11 at 17:50
  • @Ben: There is confusion between the semantics of data and the meaning of symbols in C++. When declaring: int i; does "i" point into memory? In a general sense, symbols are memory addresses, including the "i" in "int i = 0;" and the "j" in "int & j = i;". The distinction is the meaning of the data at the memory address. For both i & j, the data behind the symbol is "0". However, in the case of "int * pi = & i;" the data behind symbol pi is a memory address, not the value of i, which is "0". – ThomasMcLeod Jan 17 '11 at 18:12
  • @Ben: Some compiler designers may choose to internally represent references as memory addresses, which require dereferencing, but this is a design choice for the convenience of the compiler designer. (This is apparently the case for the questioner's compiler.) Even if true, this does not change the function or use of references. They still functionally act as if they were data objects and not pointers. So I don't think it's correct to say that a reference points into memory. – ThomasMcLeod Jan 17 '11 at 18:30
  • @Tomalak: The concept of reference as alias does not come from me, but right out of the popular C++ text by Deitel & Deitel. The book states "Reference variables can be used as local aliases within a function. . . . All operations supposedly performed on the alias is actually performed on the variable." So in what way is this a "sad and common misconception"? – ThomasMcLeod Jan 17 '11 at 18:48
  • @ThomasMcLeod: "can be used as" falls far short of promising a comprehensive explanation. The `|` operator can be used to make a letter uppercase. Does that mean that every time you see it, a string is being converted to uppercase? – Ben Voigt Jan 17 '11 at 19:28
  • It should be perfectly clear that references DO NOT act as if they were data objects. When a data object leaves scope, a destructor is called. When a reference leaves scope, no destructor is called (except for the special case of const-reference-induced lifetime extension of temporary variables). – Ben Voigt Jan 17 '11 at 19:30
  • @Ben: Agreed that there are minor functional differences between a T & and a T, as you point out. But these differences do not involve indirection or dereferencing, as is the case with pointers. So in this sense - how references appear to the programmer to access underlying data - references act as if they were data objects. This is not the case with pointers. – ThomasMcLeod Jan 17 '11 at 19:54
  • @ThomasMcLeod: "Minor functional differences"? As for D&D, apparently that book is also spreading that sad misconception; I guess it had to come from somewhere. I wish that references were simply variable aliases: I used to think that they are, and semantically they can be seen to be. But practically? Absolutely not. – Lightness Races in Orbit Jan 17 '11 at 22:53
  • @ThomasMcLeod: whether or not the object destructor is called is a minor difference? One could say it's a much more important difference than whether it is explicitly (pointers) or implicitly (references) dereferenced (and there is dereferencing/indirection involved, how else convey the location of the actual object to the code using the reference?). After all, the latter is caught by the compiler, while the former usually leads to late night debugging. Thus references act much more like pointers than objects, with the minor difference that they must be initialized. – Grizzly Jan 17 '11 at 22:57
  • @Grizzly: Pointers are variables that exist independently of any actual data object (unless you consider an address a data object). References are bound to a normal data object in memory, just like any other variable declaration is bound to a data object in memory. Unlike a pointer, this binding cannot be manipulated, evaluated or modified. References are a secondary binding, not the primary one. That is why they can go out of scope and the object still exists. But they are still a direct, fixed binding. However, pointers bind to nothing but an address. So references are not like pointers. – ThomasMcLeod Jan 17 '11 at 23:27
  • @ThomasMcLeod: Nearly everything you just said is true of a const pointer. So references are like pointers. And we haven't even touched on polymorphism, which applies to both references and pointers but not data objects. – Ben Voigt Jan 18 '11 at 01:49
  • @Ben: if you're referring to pointer constants, like int * const pi = & k;, these are not like references, because they are still bound to addresses, not data objects. I concede that, for the purposes of polymorphic access to objects, references act like pointers. – ThomasMcLeod Jan 18 '11 at 02:50
  • @ThomasMcLeod: It's impossible to distinguish between binding to an address and binding to an object without the ability to relocate objects. I'm not sure what the standard says about it, but I'm fairly confident that actual implemented behavior is that references bind to storage locations (in effect addresses) rather than objects, which would be observable if a new object of identical type was constructed in the same location. – Ben Voigt Jan 18 '11 at 03:51
  • And I found what the standard says about it: I'll add the quote to my answer, since it cannot fit here. – Ben Voigt Jan 18 '11 at 03:55