3

With my compiler at least, creating a reference implies no dereferencing. Therefore, code like the following works:

#include <iostream>

void trivialExample(char* array, int length)
{
    char& ref = array[6];  // reference is bound before the length check
    if (length > 6)
    {
        std::cout << ref << std::endl;
    }
}

That is, given a char array and its length (and assuming a bunch of trivialities like the array elements are all initialized and the passed length is correct), it will print the seventh character only if it actually exists.

Is this relying on undefined behavior?

Dan
zneak
  • A reference is ultimately just a pointer under the hood for most compilers, so it ought to work, but I'd say it's a 'bad idea' in your example, since there will be the temptation to use the reference outside of the length > 6 block, which could crash. – vercellop Oct 01 '11 at 18:32
  • @zneak You must always validate the index in C++, since there is automatic checking. So, you have to put your `char& ref` definition under your `if`. –  Oct 01 '11 at 18:37
  • @Mr.DDD C++ has automatic checking? Do you mean that it has _no_ automatic checking? Also, your rationale is that I'm shooting myself in the foot by doing this. If I want to really carefully aim between my toes but still not fall into undefined behavior, am I good with this? – zneak Oct 01 '11 at 18:42
  • 2
    @vercellop: Creating a pointer that points more than one past the end of an array *is* undefined behavior. – fredoverflow Oct 01 '11 at 18:44

5 Answers

6

Actually, this is conceptually (and practically) no different from the following:

#include <iostream>

void trivialExample(char* array, int length)
{
    char *ptr = &array[6];  // pointer is formed before the length check
    if (length > 6)
    {
        std::cout << (*ptr) << std::endl;
    }
}

My educated guess is that you intend to call it this way:

char buffer[4];
trivialExample(buffer, sizeof(buffer));

And in C++, as in C, merely forming a pointer outside the bounds of the declared array (other than one past the last element) invokes undefined behaviour, even if the pointer is never dereferenced.

The rationale is that there may be (are?) architectures that fault merely by loading an invalid address into a CPU register.

UPDATE: After some research, and hints from other SO users, I've become convinced that C++ does not allow binding a reference outside the declared object, not even to the one-past-the-last element. In this particular case the results are the same, except when index 6 is the one-past-the-last position (an array of exactly six elements), which would be allowed in the pointer version and not in the reference one.
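To illustrate the pointer/reference difference described in the update, here is a minimal sketch. The helper `countNonSpaces` is hypothetical and not part of the question; it assumes `length` is the true number of elements in `array`. It only shows that a one-past-the-end pointer may be formed and compared, provided it is never dereferenced:

```cpp
#include <cstddef>

// Hypothetical helper (not from the question): counts the non-space characters.
std::size_t countNonSpaces(const char* array, std::size_t length)
{
    // One past the last element: forming and comparing this pointer is allowed,
    // dereferencing it is not.
    const char* end = array + length;
    std::size_t count = 0;
    for (const char* p = array; p != end; ++p)
    {
        if (*p != ' ')
        {
            ++count;
        }
    }
    return count;
}
```

A reference has no equivalent of `end`: something like `const char& r = array[length];` would have to refer to an object that does not exist.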

rodrigo
  • I could verify that this is the case for at least C (section 6.5.6 "Additive Operators", point 8: _If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined._), but I don't have a copy of the C++ standard at hand. I'm ready to believe you, but if you had such a reference, I'd be much, much happier. – zneak Oct 01 '11 at 18:51
  • But isn't that what end() of STL containers does? Is STL vector ill formed? – selalerer Oct 01 '11 at 18:53
  • First of all, `char &ref = array[6]` is not *conceptually* the same as `char *ptr = &array[6];`. The Standard doesn't say that. What you're talking about is an implementation detail; the Standard doesn't require that. – Nawaz Oct 01 '11 at 18:54
  • Specifically, I'd be looking for either the equivalent passage for pointer validity in the C++ standard and a passage that mandates the use of pointers for references, or a passage that says my example is downright undefined behavior. – zneak Oct 01 '11 at 18:55
  • @selalerer, notice the exception in parentheses: a pointer to one past the last element is allowed. – zneak Oct 01 '11 at 18:56
  • I happen to have the C++11 draft at hand: section 5.7 "Additive Operators", paragraph 5: If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined. – rodrigo Oct 01 '11 at 19:00
  • @Nawaz What I meant by _conceptually not different_ is that both snippets have the same meaning and more or less the same sequence of memory accesses. – rodrigo Oct 01 '11 at 19:18
  • It's not safe to reason about references by assuming that they act like pointers. They're very similar in some ways, but the only way to be sure is to see what the standard says. And in fact, there is a significant difference between pointers and references in this context; a pointer can point just past the end of an array, but a reference cannot. – Keith Thompson Oct 02 '11 at 02:16
2

The behavior is undefined.

Quoting from the C++ 2003 standard (ISO/IEC 14882:2003(E)), 8.3.2 paragraph 4:

A reference shall be initialized to refer to a valid object or function.

Which also implies that initializing a reference to just past the end of an array is undefined behavior, since there's no valid object there.

Keith Thompson
1
char& ref = array[6];

This is okay as long as the array has at least 7 elements. Otherwise it is undefined behavior (UB).

std::cout << ref << std::endl;

This is okay as long as array[6] is initialized or assigned with some value. Otherwise it is UB.
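When the array's size is known at compile time, the "at least 7 elements" condition can be checked by the compiler. This is a sketch assuming C++11 for `static_assert`; `seventh` is a hypothetical helper, and it does not apply to the question's `char*` parameter, where the size is only known at run time:

```cpp
#include <cstddef>

// Hypothetical helper: accepts only real arrays, so N is the true element count.
template <std::size_t N>
char& seventh(char (&array)[N])
{
    static_assert(N >= 7, "array must have at least 7 elements");
    return array[6];
}
```

Calling `seventh` on a `char[4]` then fails to compile instead of invoking undefined behavior.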

Nawaz
  • So if `array` is not at least 7 elements long, creating the reference relies on undefined behavior? – zneak Oct 01 '11 at 18:31
  • @zneak. Yes. Definitely, it's UB, then. – Nawaz Oct 01 '11 at 18:31
  • Actually, I think that it would be defined behavior for an array of size 6, because the one-past-the-last element of an array is _addressable_, although not _referenceable_. – rodrigo Oct 01 '11 at 18:41
  • If merely defining a reference to a non existent element of an array is undefined then, considering array to pointer promotion, any code that creates a reference to an element in an offset from a pointer is undefined and that just doesn't compile with me. – selalerer Oct 01 '11 at 18:49
  • @rodrigo: No. If the size of array is `6`, then `(array+6)` is well-defined, but `*(array+6)` is undefined. – Nawaz Oct 01 '11 at 18:50
  • 1
    @Nawaz It seems that this [question](http://stackoverflow.com/questions/988158/take-the-address-of-a-one-past-the-end-array-element-via-subscript-legal-by-the) has a similar background, without the reference thing however. The debate seems inconclusive, although from a **very** strict point of view I agree with you, and reticently retract my previous comment... – rodrigo Oct 01 '11 at 19:20
1

I don't know about creating references, but accessing an element outside the bounds of the array is undefined behavior. So yes, your code is relying on undefined behavior.

K-ballo
  • He checks the length before accessing the array. – selalerer Oct 01 '11 at 18:30
  • @selalerer: No he doesn't, he checks the length right **after** accessing the array. – K-ballo Oct 01 '11 at 18:31
  • 1
    There's no access before the `std::cout` line. Creating a reference, at least on my compiler, does not dereference the pointer. – zneak Oct 01 '11 at 18:31
  • @zneak: What am I missing? There is that `array[6]` evaluation right before checking the length... – K-ballo Oct 01 '11 at 18:32
  • All the compilers I know implement references as pointers, so `char& foo = arr[6]` is equivalent to `char* foo = arr + 6`, and each access to `foo` is equivalent to dereferencing that address. – zneak Oct 01 '11 at 18:36
  • @zneak: "All the compilers I know" != "Standard C++" – K-ballo Oct 01 '11 at 18:38
  • In the naive implementation, that does some address arithmetic, but does not access any memory. So it can't throw, say, a segmentation fault. This doesn't mean it's safe, because there is no mandate to use the naive implementation. – dmckee --- ex-moderator kitten Oct 01 '11 at 18:38
  • @K-Ballo: This is precisely why I'm asking. Can you assert that standard C++ makes what I'm doing illegal? – zneak Oct 01 '11 at 18:40
  • @zneak: K-ballo did just that in his answer. Just because you don't want to believe him doesn't mean he is wrong. (He isn't). – David Hammen Oct 01 '11 at 19:49
-1

No, it is not undefined behavior. It is just like defining a pointer. A reference is a pointer with a friendlier and less error-prone syntax.

selalerer
  • 1
    While this is a reasonable way to think about references, you should be aware that the compiler is free to implement them in other ways. In particular, references that share a scope may be nothing more than an alias in the symbol table during compilation and have *no* existence at all in the compiled program. – dmckee --- ex-moderator kitten Oct 01 '11 at 18:36
  • 2
    As others have pointed out, setting a pointer to point to a nonexistent object has undefined behavior. (There is one difference: you can set a pointer to just past the end of an array, as long as you don't dereference it, but you can't do the same with a reference.) – Keith Thompson Oct 02 '11 at 02:15