
I thought of a strange cheat in C++. Normally, I can't smuggle a reference out of a scope because I can't define an uninitialized reference in the containing scope. However, I can define a pointer to a class containing a reference, fail to initialize it, and then assign it the address of some dynamic memory initialized to a local variable. Even though that dynamic object contains a reference to a variable that is supposed to go out of scope, the pointed-to object still has a valid reference with the same value! g++ doesn't complain even if I tell it to be -pedantic, so I assume it's valid. But how, and why?

struct int_ref
{
  int &x;
  int_ref(int &i): x(i) {}
};

#include <iostream>
using namespace std;

int main(void)
{
  int_ref *irp;
  int i = 1;
  int_ref a(i); // Creates an int_ref initialized to i
  irp = &a; // irp is now a pointer to a reference!
  // Prints 1
  cout << "irp->x = " << irp->x << " (i = " << i << ")" << endl;
  i = 2;
  // Prints 2
  cout << "irp->x = " << irp->x << " (i = " << i << ")" << endl;
  int j = 3;
  int_ref b(j);
  irp = &b;
  // Prints 3
  cout << "irp->x = " << irp->x << " (i = " << i << ", j = " << j << ")" << endl;
  i = 1;
  // Still prints 3
  cout << "irp->x = " << irp->x << " (i = " << i << ", j = " << j << ")" << endl;
  {
    int k = 4;
    irp = new int_ref(k);
    // k goes out of scope
  }
  int k = 1; // Doesn't affect the other k, of course
  // Prints 4 ?!
  cout << "irp->x = " << irp->x << " (i = " << i << ", j = " << j << ")" << endl;
}

Edit: This may in fact be (as suggested in the answers) an undiagnosed dangling reference. What about if I define int_ref like this:

struct int_ref
{
  const int &x;
  int_ref(const int &i): x(i) {}
};

A const reference need not refer to an lvalue, so there is no well-defined concept of a dangling one. Is the code still undefined?

Ryan Reich

2 Answers


What you have done has undefined behavior.

  {
    int k = 4;
    irp = new int_ref(k);
    // k goes out of scope
  }
  int k = 1; // Doesn't affect the other k, of course
  // Prints 4 ?! ***it could print 42 - no guarantees***
  cout << "irp->x = " << irp->x << " (i = " << i << ", j = " << j << ")" << endl;

You're keeping (and using) a reference to an object that has gone out of scope. It's undefined behavior regardless of how indirectly you keep that reference.

The worst manifestation of undefined behavior is seeming to work OK. In this case you have a false sense of security, but nasal daemons may fly all over the place, believe me ;)

As to why the compiler doesn't complain even in pedantic mode: detecting this kind of error takes static analysis that is very difficult to do in general, so it is left to you to watch for.

Armen Tsirunyan
  • There's a case to be made that, since we humans can see this is UB almost immediately, the compiler should be able to as well. Of course the real reason is that UB doesn't need to be diagnosed. – Lightness Races in Orbit Jul 16 '11 at 18:38
  • @Tomalak: A good compiler does many things that it doesn't have to according to the standard. I think the real reason is that it's not so trivial – Armen Tsirunyan Jul 16 '11 at 18:40
  • I ask this simply because the language doesn't provide a normal mechanism for passing references out of scope, since you can't reassign a reference like you could a pointer (the other way to get this behavior). – Ryan Reich Jul 16 '11 at 18:43
  • @Ryan The language doesn’t allow this for a reason: to *protect* you against undefined behaviour due to dangling references. – Konrad Rudolph Jul 16 '11 at 18:47
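
If the goal is to let a value outlive the block that created it, one option with defined behaviour is to share ownership of the object instead of referring to a local. The following is only a sketch of that idea, not something proposed in the thread: `int_handle` and `read_after_scope` are hypothetical names, and the `int` lives on the heap rather than on the stack.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical alternative to int_ref: it shares ownership of the int
// instead of holding a reference to a local variable.
struct int_handle {
  std::shared_ptr<int> p;
  explicit int_handle(std::shared_ptr<int> q) : p(std::move(q)) {}
  int &x() const { return *p; }
};

// Builds the handle inside an inner block and reads it after the block ends.
int read_after_scope() {
  std::unique_ptr<int_handle> h;
  {
    std::shared_ptr<int> k = std::make_shared<int>(4);
    h.reset(new int_handle(k));
    // The shared_ptr named k goes out of scope here, but the int it
    // manages stays alive because h->p still owns it.
  }
  return h->x(); // well-defined: the int has not been destroyed
}
```

Here `read_after_scope()` returns 4 by guarantee rather than by accident, because the lifetime of the `int` is tied to its last owner and not to the enclosing block.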

Just because your compiler doesn't emit a diagnostic for your program doesn't mean that your program is good, valid and safe.

Your final line invokes Undefined Behaviour because you have a dangling reference. The compiler is not required to diagnose this, but that doesn't make it right. You could get 4 (if the value just happens to still be at that place in memory), you could get some other value, you could get a segmentation fault, or your PC could explode.

Just don't do this.


Terminology note: there's no "pointer to reference" (no such thing exists) here, only a pointer to an int_ref.
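
The terminology point can be checked mechanically. The sketch below reuses the `int_ref` struct from the question; the function name `address_of_reference_is_address_of_referent` is made up for illustration. It shows that `irp` has type `int_ref *`, and that taking the address of the reference member yields an `int *` to the referent.

```cpp
#include <cassert>
#include <type_traits>

struct int_ref {
  int &x;
  int_ref(int &i) : x(i) {}
};

// Note: a declaration such as `int &*p;` is ill-formed, because the
// language has no pointer-to-reference type at all.
bool address_of_reference_is_address_of_referent() {
  int i = 1;
  int_ref a(i);
  int_ref *irp = &a;

  static_assert(std::is_same<decltype(irp), int_ref *>::value,
                "irp is a pointer to an int_ref object");
  static_assert(std::is_same<decltype(&irp->x), int *>::value,
                "&irp->x is a plain int*, not a pointer to a reference");

  // Applying & to the reference gives the address of i itself.
  return &irp->x == &i;
}
```

The function returns `true`: every operation on `irp->x`, including `&`, acts on `i`, so there is never a separate "reference object" to point at.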

Lightness Races in Orbit
  • @Armen: His code has a comment that says "irp is now a pointer to a reference!" :) – Lightness Races in Orbit Jul 16 '11 at 18:39
  • Thanks for your answer. I've suggested a variant on the question; could you look at the new paragraph at the bottom? (Also, re: the comment: I guess I was overenthusiastic. It does, as I say, only imitate one). – Ryan Reich Jul 16 '11 at 18:40
  • @Ryan: I believe it's still undefined. `const`ness of the reference extends the lifetime of a temporary, but your `k` is not a temporary. – Lightness Races in Orbit Jul 16 '11 at 18:41
  • A reference is really just a fancy pointer, so your reference is pointing to the memory address that used to be occupied by k=4. It looks like the compiler did not use that memory address for anything else, so the value is still there. – Gabriel Jul 16 '11 at 18:57
  • @Tomalak As unreliable as it looks, Wikipedia says otherwise: "Because a reference is usually implemented as an underlying pointer, ..." http://en.wikipedia.org/wiki/Reference_(C%2B%2B), or http://stackoverflow.com/questions/3954764/how-is-reference-implemented-internally – Gabriel Jul 16 '11 at 19:05
  • @Gabriel: That statement is correct. But "is usually implemented as" is _not_ the same thing as "is a". The semantic differences are well-defined by the C++ standard. – Lightness Races in Orbit Jul 16 '11 at 19:07
  • @Tomalak You're right. Still, I wanted to give Ryan a plausible practical explanation for the behaviour. – Gabriel Jul 16 '11 at 19:17
  • @TomalakGeret'kal let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/1540/discussion-between-gabriel-and-tomalak-geretkal) – Gabriel Jul 16 '11 at 19:19
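
The comment above about `const` and temporaries can be illustrated without touching a dangling reference. This is a sketch under assumptions: `Tracer`, `tracer_ref` and both function names are hypothetical, and `tracer_ref` simply mirrors the shape of the `const int &` variant of `int_ref`. Each function records when the destructor runs relative to the end of the inner block.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: appends to a log on construction and destruction.
struct Tracer {
  std::string *log;
  explicit Tracer(std::string *l) : log(l) { *log += "ctor;"; }
  ~Tracer() { *log += "dtor;"; }
};

// Same shape as the const variant of int_ref, but referring to a Tracer.
struct tracer_ref {
  const Tracer &t;
  explicit tracer_ref(const Tracer &x) : t(x) {}
};

// A temporary bound directly to a local const reference lives as long
// as that reference does, i.e. until the end of the inner block.
std::string temporary_bound_to_local_const_ref() {
  std::string log;
  {
    const Tracer &r = Tracer(&log);
    (void)r;
    log += "in-scope;";
  } // the temporary is destroyed here, together with r
  log += "after-scope;";
  return log;
}

// A named local is not a temporary: it dies at the end of its block even
// though a heap-allocated tracer_ref still holds a const reference to it.
std::string named_local_behind_member_ref() {
  std::string log;
  tracer_ref *holder = nullptr;
  {
    Tracer k(&log);
    holder = new tracer_ref(k);
    log += "in-scope;";
  } // k is destroyed here
  log += "after-scope;";
  // holder->t now dangles; reading through it would be undefined
  // behaviour, so this sketch deliberately never does.
  delete holder;
  return log;
}
```

Both functions return `"ctor;in-scope;dtor;after-scope;"`. In the second one the destructor has already run while `holder` still exists, which is why making the member `const` does not rescue the original program.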