
In C, why can we access an array element whose index is out of range?

int arr[5];
printf("%d", arr[200]);

This gives a garbage value. Why is this?

Allan Wind
  • Accessing an array using an invalid index is UB (undefined behavior). Therefore anything can happen. – wohlstad Oct 11 '22 at 05:55
  • Welcome to the wonderful world of C. There is no error checking. When you break the rules of the language, you don't get a reliable error message; instead you get [undefined behavior](https://stackoverflow.com/questions/2397984/undefined-unspecified-and-implementation-defined-behavior). – Nate Eldredge Oct 11 '22 at 05:56
  • C doesn't "waste" time checking array bounds. It trusts that you know what you are doing. As for "This gives a garbage value": sometimes yes, but it may also crash the program or do other things. It's undefined. – Support Ukraine Oct 11 '22 at 05:56
  • And what's your expectation? Do you expect the compiler to warn you? Are you expecting the program to fail at runtime? I would suggest using the `-Wall` flag while compiling. – babon Oct 11 '22 at 05:59
  • It's not garbage, it is the data read from `*(arr + 200)`. Such reads can be educational, since they illustrate how the stack is laid out on a particular system, which platform-specific code sometimes exploits. Indexing past the declared size is also sometimes used to let a malloced struct whose final member has size 1 store an arbitrary number of items [like so](https://onlinegdb.com/IWqx_CrrZ); for a more practical example see [msgbuf](https://tldp.org/LDP/lpg/node30.html). – Dmytro Oct 11 '22 at 06:26
  • Off-site duplicate: [What is undefined behavior and how does it work?](https://software.codidact.com/posts/277486) – Lundin Oct 11 '22 at 06:32
  • Please note that in this situation, `arr` has not been initialized, so even accessing an index between 0 and 4 will give an _indeterminate_ (possibly "garbage") value. – Chris Oct 11 '22 at 07:03
  • FWIW: Even C++ skips boundary checking for the common way of accessing vector-elements, i.e. operator[]. There is the at() function for element access with boundary checking but for performance the operator[] exists as well (and is, AFAIK, what is used in most cases). – Support Ukraine Oct 11 '22 at 07:29

3 Answers

C does not enforce bounds checks on arrays, probably for performance reasons at the time the language was designed. Your program exhibits undefined behavior (UB) when it tries to access the array out of bounds. It would also be UB if the access were within bounds, as the array is not initialized.

Some bounds checks can be done at compile time. For instance, if you build your program with `gcc -O2 -Warray-bounds 1.c`:

#include <stdio.h>

int main(void) {
        int arr[5];
        printf("%d", arr[200]);
}

The compiler will warn you about this issue:

1.c: In function ‘main’:
1.c:5:2: warning: array subscript 200 is above array bounds of ‘int[5]’ [-Warray-bounds]
    5 |  printf("%d", arr[200]);
      |  ^~~~~~~~~~~~~~~~~~~~~~
1.c:4:6: note: while referencing ‘arr’
    4 |  int arr[5];
      |      ^~~

Other bounds checks need to happen at run time. If you build this program with `gcc -fsanitize=bounds 2.c`:

#include <stdio.h>

int main(void) {
        int i;
        scanf("%d", &i);
        int arr[5] = {0, 1, 2, 3, 4};
        printf("%d", arr[i]);
}

The program will now give you a run-time error for input 200 as it's out of bounds:

2.c:7:18: runtime error: index 200 out of bounds for type 'int [5]'
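
Where neither a compiler warning nor a sanitizer is available, the same check can be written by hand. The following is only a minimal sketch and not something the language or gcc provides; the helper name `checked_get` and its signature are hypothetical:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not a standard function): copies arr[index] into
 * *out only when index is within [0, len). Returns true on success and
 * false when the access would have been out of bounds. */
bool checked_get(const int *arr, size_t len, size_t index, int *out)
{
    if (arr == NULL || out == NULL || index >= len) {
        return false;   /* refuse instead of invoking undefined behavior */
    }
    *out = arr[index];
    return true;
}
```

With `int arr[5] = {0, 1, 2, 3, 4};`, a call such as `checked_get(arr, 5, 200, &v)` returns false and leaves `v` untouched, so the caller decides how to report the bad index. The length has to be passed in because the function only receives an address.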
Allan Wind

C is now a rather old language. It was created in the 1970s as the primary language for building the first Unix operating systems (kernel, libraries and commands). The main goal was to have something that was easy to translate efficiently into machine code, yet more or less portable.

For that reason, performance was much more important than robustness or portability. The rule was that programmers had to know what they were writing, and the compiler should just blindly obey. The notion of an array was as simple as possible: just a start address, with the size being used only at declaration time to reserve memory. Not only did the language not check index validity, it did not even carry the information needed to perform such checks. This is still part of the philosophy of the language and is the reason why, even in recent versions, an array decays to a pointer when it is used where a pointer could be.
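
The decay described above can be sketched in a few lines. This is an illustration under assumptions, not code from the answer: the macro `ARRAY_LEN` and the helpers `sum_ints` and `size_seen_by_callee` are hypothetical names introduced here.

```c
#include <stddef.h>

/* Element count; only meaningful where `a` is a real array in scope,
 * not a pointer that an array has decayed to. */
#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical helper: `values` is just an address here, so the caller
 * must pass the element count explicitly. */
long sum_ints(const int *values, size_t len)
{
    long total = 0;
    for (size_t i = 0; i < len; i++) {
        total += values[i];
    }
    return total;
}

/* Hypothetical helper: shows what the callee can learn from the argument
 * alone, which is the size of a pointer and nothing about the array. */
size_t size_seen_by_callee(const int *values)
{
    return sizeof(values);   /* same as sizeof(const int *) */
}
```

At the declaration site, `ARRAY_LEN(arr)` still yields 5 for `int arr[5]`, but once `arr` is passed to a function only the address survives, which is why C interfaces conventionally take a separate length argument.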

Of course, compilers are now much more user friendly, and good ones warn the programmer about common errors that can be detected at compile time. But because of the philosophy of the language (and to avoid breaking legacy code), those are only warnings, and the compiler will still do its best to translate the code as closely as possible into machine code. Good programmers simply know that the standard says code like that invokes Undefined Behaviour, and do not write it.

Serge Ballesta
  • Important detail you mention here: The language does not even have the means to know what size an array is, outside the scope of its declaration. A different translation unit gets only an address and has no means to check boundaries; consequently, even in the TU where the definition is visible such checks were not implemented. (But as Allan's answer shows, it can easily be done with gcc, and of course any program containing debug information could do that, too.) – Peter - Reinstate Monica Oct 11 '22 at 07:11

In C, reading an array element means reading the bytes at the offset corresponding to the index.

In your example, `arr` is defined as an array of 5 `int`. To compile `printf("%d", arr[200])`, the compiler generates code that computes the address of an `int` at index 200 of array `arr`. The code multiplies the index value by the number of bytes in an `int` (`sizeof(int)`, probably 4 on your target), adds this offset (800) to the address of `arr`, and generates the code to read 4 bytes at this address and pass them to `printf`.

The computed address may be invalid and cause a runtime error such as a segmentation fault, but the C Standard just specifies this as undefined behavior, i.e. anything can happen.
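
The address arithmetic described above can be spelled out by hand for an index that is in range. This is a rough sketch of the idea rather than what any particular compiler emits; the helper names `int_offset_bytes` and `read_by_offset` are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Byte offset of element `index` in an array of int: index * sizeof(int).
 * With a 4-byte int, index 200 gives 800. */
size_t int_offset_bytes(size_t index)
{
    return index * sizeof(int);
}

/* Hypothetical helper: reads arr[index] by adding the byte offset to the
 * start address and copying sizeof(int) bytes from there. The caller must
 * guarantee that index is in range; nothing here checks it. */
int read_by_offset(const int *arr, size_t index)
{
    const unsigned char *base = (const unsigned char *)arr;
    int value;
    memcpy(&value, base + int_offset_bytes(index), sizeof value);
    return value;
}
```

For a valid index, `read_by_offset(arr, i)` returns the same value as `arr[i]`. Nothing in the computation depends on the declared size of the array, which is why an index of 200 produces an address just as readily as an index of 3.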

chqrlie