3

Is it undefined behavior if I go through the elements of a 2D array in the following manner?

int v[5][5], i;

for (i = 0; i < 5*5; ++i) {
     v[i] = i;
}

Then again, does it even compile? (I can't try it right now, I'm not at home.) If it doesn't, then imagine I somehow acquired a pointer to the first element and used that instead of v[i].

Paul Manta

5 Answers

8

Accessing elements of a multidimensional array from a pointer to the first element is Undefined Behavior (UB) for the elements that are not part of the first array.

Given T array[n], array[i] is a straight trip to UB-land for all i >= n. Even when T is U[m]. Even if it's through a pointer. It's true there are strong requirements on arrays (e.g. sizeof(int[N]) == N*sizeof(int)), as mentioned by others, but no exception is explicitly made so nothing can be done about it.

I don't have an official reference because as far as I can tell the C++ standard leaves the details to the C89 standard and I'm not familiar with either the C89 or C99 standard. Instead I have a reference to the comp.lang.c FAQ:

[...] according to an official interpretation, the behavior of accessing (&array[0][0])[x] is not defined for x >= NCOLUMNS.
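For comparison, a minimal sketch of a traversal that never leaves a row, so every subscript stays within the bounds of the array it indexes. The names ROWS, COLS, fill_sequential and check_sequential are made up for this illustration; the stored value r * COLS + c is the one the flat loop in the question would have written.

```c
#include <assert.h>

enum { ROWS = 5, COLS = 5 };  /* hypothetical names for the 5x5 shape in the question */

/* Store in v[r][c] the value a flat index would have had: r * COLS + c.
   Each inner subscript c stays below COLS, so no row is overrun. */
void fill_sequential(int v[ROWS][COLS])
{
    for (int r = 0; r < ROWS; ++r) {
        for (int c = 0; c < COLS; ++c) {
            v[r][c] = r * COLS + c;
        }
    }
}

/* Self-check: every element holds its row-major position. */
void check_sequential(int v[ROWS][COLS])
{
    for (int r = 0; r < ROWS; ++r) {
        for (int c = 0; c < COLS; ++c) {
            assert(v[r][c] == r * COLS + c);
        }
    }
}
```

Calling fill_sequential(v) on an int v[5][5] leaves v[0][0] equal to 0, v[1][0] equal to 5 and v[4][4] equal to 24, without relying on any particular reading of the standard.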

Rob Paisley
Luc Danton
  • 1
    C99 explicitly states it as an example of UB to even perform the pointer addition. – R.. GitHub STOP HELPING ICE May 16 '11 at 13:35
  • The C FAQ is not helpful, it gives an irrelevant reference to 6.5.5.2, a section that is only concerned with the subscripting syntax of multi-dimensional arrays, and not how they are accessed. This is why you should only cite the standard. – Lundin May 16 '11 at 14:30
  • Clarification: To access a multi-array this way is probably not valid syntax, but accessing it through (&array[0][0])[x] may or may not be valid syntax. – Lundin May 16 '11 at 14:36
  • @Lundin My answer isn't concerned with the exact code that appears in the question, but the situation described: "imagine I somehow acquired a pointer to the first element and using taht [...]". – Luc Danton May 16 '11 at 14:45
4

It will not compile.

The more or less equivalent

int v[5][5], *vv, i;

vv = &v[0][0];
for (i = 0; i < 5*5; ++i) {
     vv[i] = i;
}

and

int v[5][5], i;

for (i = 0; i < 5*5; ++i) {
     v[0][i] = i;
}

will compile. I'm not sure if they are UB or not (and it could in fact be different between C90, C99 and C++; aliasing is a tricky area). I'll try to find references one way or the other.

AProgrammer
  • I think it's totally fine, as all multi-dimensional arrays are consecutive in memory (and this is guaranteed) and using `operator[]` for arrays is just pointer arithmetic. – Kiril Kirov May 16 '11 at 09:32
  • 1
    I'd not be surprised that my first example is defined and the second UB: you are leaving the boundary object. On most current targets it won't make a difference but on x86 huge model (16 bits int, 64 K segment) in the case int[32000][32000] one may detect that int[0] is in one segment and not update the segment part when computing int[0][i] and thus fails for value of i > 32765 (obviously, you'll need unsigned or long index to get that on that platform). – AProgrammer May 16 '11 at 09:55
  • 1
    In C++ that is UB. but if you do `vv++;` and access by `*vv` then that is fine. You can only step one each time. – Johannes Schaub - litb May 16 '11 at 10:53
  • @Kiril Where is that guaranteed? Could you cite the relevant part of the C/C++ standard? – Lundin May 16 '11 at 14:34
  • 1
    @Lundin - I don't have a copy of the standard, but found something - in ISO/IEC 14882:2003(E) (don't know what "(E)" means): 8.3.4 Arrays: "An object of array type contains a **contiguously allocated** non-empty set of N sub-objects of type T" – Kiril Kirov May 16 '11 at 16:03
  • @Kiril A similar text can be found in C99 6.2.5 §20. But it doesn't mention multi-arrays at all, unless it should be interpreted "recursively": a multi-dimensional array is an array of arrays. – Lundin May 17 '11 at 06:15
  • @Lundin - I'm pretty sure that this is the key (: – Kiril Kirov May 17 '11 at 06:16
  • Then I find the UB listed in the Annex of the C standard quite strange (see the cited text in my post). Why wouldn't you be allowed to address the multi-array out of bounds if you knew there were valid objects behind it? I guess it could somehow be related to OS virtual memory, but that seems far-fetched. – Lundin May 17 '11 at 06:19
3

It is really quite hard to find any reference in the standard explicitly stating that this is undefined behavior. Sure, the standard clearly states (C99 6.5.6 §8-9) that pointer arithmetic beyond the bounds of an array is UB. The question then is, what is the definition of an array?

If a multi-dimensional array is regarded as an array of array objects, then it is UB. But if it is regarded as one array with multiple dimensions, the code would be perfectly fine.

There is an interesting note of another undefined behavior in Annex J of the standard:

An array subscript is out of range, even if an object is apparently accessible with the given subscript (as in the lvalue expression a[1][7] given the declaration int a[4][5]) (6.5.6).

This insinuates that accessing a multi-dimensional array out of the range of the 1st dimension is undefined behavior. However, the annex is not normative text, and 6.5.6 is quite vague.
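To see why the object is "apparently accessible" in the Annex J example, here is a small sketch of the row-major arithmetic. The helper name flat_offset is invented for this illustration, and the code only computes offsets; it never performs the out-of-range access itself.

```c
#include <assert.h>
#include <stddef.h>

/* Offset, counted in elements from a[0][0], that row-major layout
   would give a[row][col] when each row has ncols elements.
   flat_offset is a hypothetical helper, not something from the standard. */
size_t flat_offset(size_t row, size_t col, size_t ncols)
{
    assert(ncols > 0);
    return row * ncols + col;
}
```

With int a[4][5], flat_offset(1, 7, 5) is 1 * 5 + 7 = 12, the same as flat_offset(2, 2, 5). So a[1][7] would nominally land on the storage of a[2][2], which is inside the object a, yet the subscript 7 is out of range for the row a[1].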

Perhaps someone can find a clear definition of the difference between an array object and a multi-dimensional array? Until then, I am not convinced that this is UB.

EDIT: I forgot to mention that v[i] = i is certainly not valid C. As per 6.5.2.1, v[i] is equivalent to *(v+i), which designates an array of 5 ints and not an int element, and an array cannot be assigned to. What I am not certain about is whether accessing it as v[0][too_large_value] is UB or not.
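A quick way to confirm the type of v[i] is to look at its size. The sketch below uses made-up function names (row_bytes, element_bytes, check_sizes) and relies only on sizeof, which does not evaluate its operand here, so the uninitialized array is never read.

```c
#include <assert.h>
#include <stddef.h>

/* Size of one row: v[0] has type int[5], so this is 5 * sizeof(int). */
size_t row_bytes(void)
{
    int v[5][5];
    return sizeof v[0];
}

/* Size of one element: v[0][0] has type int. */
size_t element_bytes(void)
{
    int v[5][5];
    return sizeof v[0][0];
}

/* Self-check: a row is exactly five elements wide. */
void check_sizes(void)
{
    assert(row_bytes() == 5 * element_bytes());
}
```

Since v[i] is a whole row of type int[5], the assignment v[i] = i has an array on its left-hand side and is rejected by the compiler.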

Lundin
2

Here v[i] denotes an array of 5 ints. An array is referred to through the address of its first element, and the size of that address depends on your C compiler: it could be 16 bits, 32 bits, and so on.

So v[i] = i may compile with some compilers, but it definitely won't yield the result you are looking for.

The answer by sharptooth is correct: v[i][j] = i is one of the easiest and most readable solutions.

Another option could be:

int *ptr;
ptr = &v[0][0];  /* plain "ptr = v" would mix int (*)[5] with int * */

Now you can iterate over this ptr to assign the values:

for (i = 0; i < 5*5; i++, ptr++) {
     *ptr = i;
}
Kiril Kirov
Nik
0

This will not compile.

You will get the following error for the line:

v[i] = i;

error: incompatible types in assignment of ‘int’ to ‘int [5]’

To give an answer taken from a similar question at:

http://www.velocityreviews.com/forums/t318379-incompatible-types-in-assignment.html

v is a 2D array. Since you are only referencing one dimension, what you end up getting is a char pointer to the underlying array, and hence this statement is trying to assign a char constant to a char pointer. You can either use double quotes to change the constant to a C-style string or you can explicitly reference v[i][0] which is what I assume you intended.

Steve Walsh