Why is the Index NOT out of bounds although it intuitively should?

Question

I'm relatively new to C programming and I stumbled upon a for me unexplainable behaviour while running the following code and debugging it using gdb and lldb.

In short: When swapping the indices i and j (max i != max j) when accessing a value in a two-dimensional Array inside a double nested for-loop it does not seem to matter if I access the value using array[i][j] or array[j][i].

The two loops and arrays are mostly identical.

 unsigned matrix[3][1] =
   {
       {3},
       {4},
       {5}
   };

   //Loop1
   for (size_t i = 0; i < sizeof(matrix) / sizeof(*matrix); i++)
   {
        for (size_t j = 0; j < sizeof(matrix[i]) / sizeof(*matrix[i]); j++)
        {
            matrix[i][j] <<= 1;
            printf("matrix[%zu][%zu]  has the value: %d\n", i, j, matrix[i][j]);
        }
   }

   //same two dimensional array as matrix
   unsigned matrix2[3][1] =
   {
       {3},
       {4},
       {5}
   };

   //Loop2, basically the same loop as Loop1
   for (size_t i = 0; i < sizeof(matrix2) / sizeof(*matrix2); i++)
   {
        for (size_t j = 0; j < sizeof(matrix2[i]) / sizeof(*matrix2[i]); j++)
        {
            //swapped i and j here
            matrix2[j][i] <<= 1;
            printf("matrix2[%zu][%zu]  has the value: %d\n", j, i, matrix2[j][i]);
        }
   }

Am I missing here something?

In both cases i is passed the value 2 at the end of the outer loop and j the value 0 at the end of the inner loop.

Intuitively, matrix[0][2] should throw an exception as each row only has one element.

C doesn't have exceptions, it has undefined behaviour. It's your responsibility to not index out of bounds. — Fredrik, May 11 '22 at 19:58
@Fredrik: The C standard does allow for exceptions, a.k.a. traps. It is up to implementations to support them or not. — Eric Postpischil, May 11 '22 at 20:00
you have the worst kind of (Undefined Behavior) - your code appears to work fine. — pm100, May 11 '22 at 20:12
@pm100 Is this UD? Isn't it defined in the standard that a multidimensional array is laid out as one-dimensional and that `[r][c]` results in 1-d-index: r * row-size + c? — meaning-matters, May 11 '22 at 20:20
You've muddied the waters by choosing `max j == 1`. So when you access `array[i][j]` the address calculation is `array_base_address + (i * 1 + j) * sizeof(int)`. When you swap the indexes, the math works out to the same address. But that's only because the second array dimension is 1. — user3386109, May 11 '22 at 20:24
@meaning-matters - i was wondering about that. I tried finding in the standards but could not. Not sure — pm100, May 11 '22 at 20:53
To demonstrate a situation where swapping the indexes may result in **observable** undefined behavior, the second array dimension needs to be larger than the first. For example consider `int array[3][100]`, `i=2`, and `j=99`. The indexing calculation for `array[i][j]` computes `2 * 100 + 99 = 299` which is inside the array, but `array[j][i]` computes `99 * 100 + 2 = 9902` which is well outside the array. — user3386109, May 11 '22 at 20:56

chameleon · Accepted Answer · 2022-05-11T20:19:10.580

6

I will take a slightly different approach than the other respondents.

You are technically not reading outside of the array's boundary as far as the memory layout is concerned. Looking at it from a human perspective you are (the index [0][2] doesn't exist!), but the memory layout of the array is contiguous. Each of the "rows" of the matrix are stored next to each other.

In memory, your array is stored as: | ? | 3 | 4 | 5 | ? | So when you index to matrix[1][0] or matrix [0][1] you are accessing the same position in memory. This would not be the case if your array was larger than 1 dimension wide.

For example, replace your array with the following one and experiment. You can access integer '4' either by indexing matrix[0][2], or matrix [1][0]. The position [0][2] shouldn't exist, but it does because the memory is contiguous.

unsigned matrix[3][2] =
   {
       {3, 6},
       {4, 8},
       {5, 10}
   };

edited May 11 '22 at 20:19

answered May 11 '22 at 20:12

chameleon

136
1
7

1

According to the highest-rated and accepted answer to [this question](https://stackoverflow.com/q/48219108/12149471), accessing a sub-array out of bounds invokes undefined behavior, even if the multi-dimensional array as a whole is not accessed out of bounds. However, this is debatable. I suggest that you also read the comments to that question, which leads you to a further question. – Andreas Wenzel May 11 '22 at 20:24
2

Yes, I agree with you. Technically speaking it is undefined behavior, and one should not get accustomed to this sort of programming. Other answers covered this part. I took the approach to explain to OP the behavior of the program, to hopefully answer the question of "why the index is not out of bounds". **Strictly** speaking, it is not out of the array boundary. – chameleon May 11 '22 at 20:32

score 3 · Answer 2 · answered May 11 '22 at 20:05

Oops, matrix[0][2] should throw an exception as each row only has one element...

Some languages do warn the programmer by an exception if they try an out of bound access, but C does not. It just invokes Undefined Behaviour. On a technical point of view, it means that the compiler does not have to test the out of bound condition. On an operational point of view, it means that anything can happen, including expected behaviour... or an immediate crash... or a modification of an unrelated variable... or...

wrong answer, in his case [0][2] and [2][0] are the same, he has a special case where index inverting has no effect — elhadi dp ıpɐɥןǝ, May 12 '22 at 14:01

Some random IT boy · Answer 3 · 2022-05-11T20:02:34.283

If my C skills aren't mega-rusty you're reading "unsafe memory".

Essentially your matrix is declared as a block of bytes. After that block of bytes there are more bytes. What are they? Usually more variables that are declared as your program's data. Once you reach the end of the program's data block you reach the user code memory block (encoded ASM instructions).

Most languages perform checks and throw an exception when you run out of bounds by somehow keeping track of the last index that is valid to access. C does not do that and doing such thing is your very own responsibility. If you aren't careful you might be overwriting important parts of your program's code.

There are attacks that one can perform on C programs that don't sanitize user input, like a buffer overrun; which exploits what it's been described.

Essentially if you declare a char[] of length N and store a string that comes from outside and this string happens to be of length N+X you'll be overwriting program memory (instructions).

With the right sequence of characters you can inject your very own assembly code into a running program which doesn't sanitize user input

Accessing outside the bounds of an array is **not** defined to let you “read and modify dangerous/unsafe memory.” The behavior is completely undefined by the C standard. — Eric Postpischil, May 11 '22 at 20:01

elhadi dp ıpɐɥןǝ · Answer 4 · 2022-05-12T15:01:17.623

As your array is int and all elements are of the same size, i don't see any problem as your array is stored in contiguous space in RAM and you use a special case of matrix where inverting indexes has no side effect.

In the first loop your indexes are [0][0], [1][0], [2][0]
In the second loop your indexes are [0][0], [0][1], [0][2]

now try to linear the access, as your array is saved as linear array into the RAM.

address of element = row * NCOL + col
row: is row number 
NCOL: number of columns into your matrix
col : the column number

so the linear index for :
[0][2] ==> 0 * 1 + 2 = 2 /* the third element*/ 
[2][0] ==> 2 * 1 + 0 = 2 /* always the third element */

But if you use a matrix of n x m , n >= 1 and m > 1 and n != m. if you inverse the indexes, the result will not be the same.

so if you take a 4 x 2 matrix

linear index of [3][1] = 3 * 2 + 1 = 7
linear index of [1][3] = 1 * 2 + 3 = 5  /* even [1][3] is out of the range of your matrix index */
[1][3] you will manipulate the element [2][1]

So be worry when manipulating matrix indexes.

Why is the Index NOT out of bounds although it intuitively should?

4 Answers4