Why don't modern compilers catch attempts to make out-of-bounds access to arrays?

Question

If int arr[5] is initialized but more than 5 elements are stored, then the extra elements get memory allocated in a separate space. Old compilers like Turbo C reported a crash because something got overwritten, but that doesn't happen with modern compilers. So how do they deal with this issue?

In short: they don't. They don't deal with such issue, because the code is invalid. Accessing array out-of-bounds is undefined behavior. — KamilCuk, Apr 18 '20 at 07:02
If you want someone looking over your shoulder, stopping you from shooting yourself in the foot, C is *not* the language to do that with. If you're expecting otherwise, it's time to sharpen coding skills that *strictly* utilize *defined behavior*. Anything else is a recipe for disaster. Imagine your homework runs "fine" on your rig, but crashes like a wave on the rig of your professor (or worse, a paying customer). Strict adherence to the language standard and its library, and quality analysis tools can raise your confidence, but in the end it's all about one thing: not writing crappy code. — WhozCraig, Apr 18 '20 at 07:11
There's a good well-focused question here. I might paraphrase it as *Why don't (or can't) modern compilers catch attempts to make out-of-bounds access to arrays ?* and I don't think the question deserves the contumely and scorn that has been piled on. It may be a duplicate, but that is another matter entirely. And if it is not a duplicate the site needs an answer - this is a fundamental issue in many programming languages and one which surprises and occasionally confuses a lot of novice programmers. — High Performance Mark, Apr 18 '20 at 08:31
A lot of modern compilers will diagnose some out-of-bounds accesses at compile time, but in general neither you nor the compiler can know the actual value of an index. Your only hope is a runtime check. This is often complicated by the fact that the array may be passed as a pointer, and then it is hard to know whether the pointer is valid. There are analogous troubles with accesses to members of structs, and especially unions. Good runtime checking of memory access errors (bad index, bad pointer to array, ...) is hard to do. You can find tools such as our CheckPointer to do this. — Ira Baxter, Apr 18 '20 at 16:17

High Performance Mark · Accepted Answer · 2020-04-18T11:38:59.887

Well, I moved to reopen so I guess I should write something of an answer:

It is impossible for a compiler (that is, at compile time) to check for out-of-bounds access in some very common situations. For example, if an array index is read from an input file, or is the result of a computation using values only established during execution, then the out-of-bounds access happens long after the compiler has finished its work.

It is burdensome for a run-time system to check for out-of-bounds accesses. That is, every such check requires a calculation of what index is to be accessed, then a check that that index is in bounds; all that on top of 'normal' operations.

To see the impact of this, take an array-operation-intensive program, compile it both without and with run-time bounds checking, and compare the execution speeds.

It seems that the widely-used compiled languages (e.g. C, C++, Fortran) all decided not to generate array-bounds checks by default. But their compilers provide the option to generate code with array-bounds checking.

(Historical diversion: I have a sneaking suspicion that in the early days C would have struggled to implement array-bounds checking at run time, since it barely distinguished between arrays and pointers and I'm not sure it always knew what the array bounds were when the code executed. Fortran, on the other hand, uses a dope vector, which includes the size of the array. Perhaps someone more knowledgeable than I could correct me on this.)

As to why one language defaults to not checking array bounds at run time, and another language defaults to checking, that's a matter for the language designers.

(Hysterical diversion: I think there were probably two reasons for not checking array bounds by default; one, the performance reason; two, these languages were hewn from stone by programmers who were real men who didn't need no stinking help from a machine to write code.)

And you can choose to work with such a language or not.

Yeah, I think most of the Algol family of languages do, even the estranged relatives. — High Performance Mark, Apr 18 '20 at 11:36
I'm not sure about Fortran. I *expect* accessing an `array[2][3]` as if it were `array[3][2]` would be valid in Fortran, just like in C (think: overlays, libraries), and the offset index ranges, like `array m[10..20]`, could just be a compile-time thing. — wildplasser, Apr 18 '20 at 11:41
I'm pretty sure about Fortran, attempting to access element `(3,2)` of an array declared with dimensions `(2,3)` will result in an out-of-bounds error at run time (provided, of course, one has compiled with the option set). But that's the kind of tweaky, language-specific detail I didn't want to get into in my answer to a general question. If anyone wants to know how Fortran behaves in the situation you outline, I'm sure they'll ask. — High Performance Mark, Apr 18 '20 at 11:49
I've used Fortran compilers that did not check out-of-bounds subscripts. (In fact, I've never used one that *did* check bounds, but I have only used about 3 or 4 different varieties, all long before F90.) — torek, Apr 18 '20 at 22:55
