1

I would like to save some typing in a loop by creating a reference to an array element that might not exist. Is it legal to do so? A short example:

#include<vector>
#include<iostream>
#include<initializer_list>
using namespace std;
int main(void){
    vector<int> nn={0,1,2,3,4};
    for(size_t i=0; i<10; i++){
        int& n(nn[i]); // this is just to save typing, and is not used if invalid
        if(i<nn.size()) cout<<n<<endl;
    }
};

https://ideone.com/nJGKdW compiles and runs the code just fine (I tried locally with both g++ and clang++), but I am not sure if I can count on that.

PS: Neither gcc nor clang complains, even when the code is compiled and run with -Wall and -g.

EDIT 2: The discussion focuses on array indexing. The real code actually uses std::list and a fragment would look like this:

std::list<int> l;
// the list contains something or not, don't know yet
const int& i(*l.begin());
if(!l.empty()) /* use i here */ ;

EDIT 3: A legal solution to what I was doing is to use an iterator:

std::list<int> l;
const std::list<int>::iterator I(l.begin()); // if empty, I==l.end()
if(!l.empty()) /* use (*I) here */ ;
eudoxos
  • Is this relevant? http://stackoverflow.com/questions/988158/take-the-address-of-a-one-past-the-end-array-element-via-subscript-legal-by-the – doctorlove Jun 15 '13 at 09:39
  • @doctorlove: I read that, I think the difference is that they talk about getting the address. There seems to be disagreement in the answers below whether creating a reference actually entails dereferencing (the "hidden pointer", though references might be implemented in a different way) or not. Compilers seem to deal with that just fine (they don't read it), but the standard, as it seems, does not allow it. – eudoxos Jun 15 '13 at 11:00

3 Answers

3

No, it's not legal. You are reading data out of bounds from the vector in the declaration of n, and therefore your program has undefined behavior.
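A minimal sketch of how the shorthand can be kept without indexing out of bounds: do the bounds check first and bind the reference only afterwards. The helper name `collect_in_range` is hypothetical (it is not in the question), and it collects the visited elements instead of printing them so the result is easy to inspect.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: walks indices 0..limit-1 but only touches elements that exist.
std::vector<int> collect_in_range(const std::vector<int>& nn, std::size_t limit) {
    std::vector<int> seen;
    for (std::size_t i = 0; i < limit; ++i) {
        if (i >= nn.size()) break;  // check first, so nn[i] is only evaluated for valid i
        const int& n = nn[i];       // the reference always refers to an existing element
        seen.push_back(n);
    }
    return seen;
}
```

Called with the question's vector and a limit of 10, this should visit only the five existing elements, and `nn[i]` is never evaluated for an invalid index.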

Some programmer dude
  • I doubt you can call declaration "reading data". What do you mean by "undefined behavior"? I think it is perfectly defined that as long as the reference is valid, the value is printed, and otherwise nothing is printed. – eudoxos Jun 15 '13 at 09:30
  • 2
    @eudoxos The point is, your assumption that the implementation must provide such a thing as an "invalid reference" - which is benign so long as you don't touch it - is groundless. Undefined behaviour means that the implementation is (for example) within its rights to simply crash as soon as you index past the end of a vector, or dereference `begin()` of an empty container - neither of which is the same as actually dereferencing the reference you're declaring. – anton.burger Jun 15 '13 at 11:57
0

I'd be surprised if this is "allowed" by the specification. However, what it does is store the address of an element that is outside the range of its allocation, which shouldn't in itself cause a problem in most cases - in extreme cases, it may overflow the pointer type, which could cause problems, I suppose.

In other words, if i is far beyond the size of nn, it could be a problem. That does not necessarily mean i has to be enormous: if each element in the vector is several megabytes (or gigabytes on a 64-bit machine), you can quite quickly run into problems with the address range.

But don't ask me to quote the specification - someone else will probably do that.

Edit: As per comment, since you are requesting the address of a value outside of the valid size, at least in debug builds, this may well cause the vector implementation to assert or otherwise "warn you that this is wrong".

Mats Petersson
0

No, for two reasons:

  1. The standard states (8.3.2):

    A reference shall be initialized to refer to a valid object or function

  2. std::vector::operator[] guarantees that even if N exceeds the container size, the function never throws exceptions (no-throw guarantee, no bounds checking other than at()). However, in that case, the behavior is undefined.

Therefore, your program is not well-formed (bullet point 1) and invokes undefined behaviour (bullet point 2).
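To illustrate the difference mentioned in bullet point 2, here is a small sketch that uses `at()`, which checks bounds and throws `std::out_of_range`, where `operator[]` would silently be undefined behaviour. The helper name `try_get` is made up for this example.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: returns true and stores the element in `out` if i is a valid index.
bool try_get(const std::vector<int>& v, std::size_t i, int& out) {
    try {
        out = v.at(i);   // at() performs a bounds check and throws on failure
        return true;
    } catch (const std::out_of_range&) {
        return false;    // v[i] in this situation would have been undefined behaviour
    }
}
```

In a hot loop an explicit `i < v.size()` test is usually the cheaper choice; `at()` is shown here only because it makes the out-of-bounds case observable instead of undefined.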

GManNickG
Damon
  • Fair enough for 1. For 2, can you specify where the behavior is undefined? The reference is never used. – eudoxos Jun 15 '13 at 10:09
  • 3
    @eudoxos If the behaviour of `operator[]` is undefined, it doesn't matter whether you do anything with the result, it's not even guaranteed to return. –  Jun 15 '13 at 10:16
  • The second point is not related to the reference itself, but to the way you initialize it. You invoke `vector::operator[]` which operates identically to `vector::at`, except for not checking bounds. `vector::at` may throw whereas `operator[]` does not check bounds and is guaranteed not to throw. However, invoking `vector::operator[]` with an out-of-bounds value, whether or not you do something with the result, is UB. It will probably "work" anyway, but it is not correct. – Damon Jun 15 '13 at 10:16
  • 1
    @eudoxos Turn the question around. In the case where `n` > `size()`, can you tell me what you expect `operator[]` to return, and why? Similarly for your edit 2 - what should `*begin()` return, and why, if the list is empty? It's undefined because there are no guarantees about what will happen. It could crash, it could assert, it could throw, it could return a reference to some meaningless value which could even be dereferenced successfully. The fact that you aren't dereferencing it doesn't matter. – anton.burger Jun 15 '13 at 12:15
  • This answer is close to being right, but uses some bad wording. The program is well-formed, the behavior is just undefined. Also, in bullet two you say "even if N exceeds the container size, the function never throws..." but then turn around and say, "it's also undefined behavior". If it's undefined behavior, anything can happen! Including throwing exceptions... Guarantees only hold while the program has well-defined behavior. – GManNickG Jun 17 '13 at 21:48
  • @GManNickG: The wording is correct. A program that does not initialize a reference to a valid value _is not well-formed_. The standard explicitly clarifies this in the note that follows saying "in particular no well formed program can therefore contain a null reference" (written down from memory, one or two words might be different). About `operator[]`, it is guaranteed not to throw. While it's in principle true that UB could mean "anything", assuming that it might throw anyway doesn't make sense. It doesn't really matter for the question though, because in either case UB is triggered. – Damon Jun 18 '13 at 07:18
  • The wording "shall" (not _should_ or _might_) is also very clear in respect of a program being well-formed. A program that doesn't follow a "shall" constraint isn't well-formed. – Damon Jun 18 '13 at 07:21
  • @Damon: The note (which is non-normative, by the way) says: "in particular, a null reference cannot exist in a well-**defined** program". I will accept that the wording you quoted is grouped into other "shalls" that fall into well-formedness, though one could argue the only way to break this particular "shall" is at runtime with undefined behavior (such as this question). But do note that "shall" can be followed by "or the behavior is undefined", it is not always strictly an indicator of well-formedness. There are many ways a well-formed program can break a "shall" and have UB. – GManNickG Jun 18 '13 at 15:35