Why does Rust check array bounds at runtime, when (most) other checks occur at compile time?

Question

If you try to use a subscript that is not in the array, you will get an error: array access is bounds-checked at run-time.

Why does Rust check array bounds at runtime, when it seems most other checks occur at compile time?

possible duplicate of [Why does Rust compiler allow index out of bounds?](http://stackoverflow.com/questions/24898579/why-does-rust-compiler-allow-index-out-of-bounds) — Shepmaster, Feb 08 '15 at 01:22
@Shepmaster that other question is about vectors, whereas this one is about arrays. — mherzl, Nov 22 '22 at 01:37

score 19 · Accepted Answer · edited Nov 04 '18 at 14:39

19

Because checking indices at compile time is not feasible in the general case. Reasoning about the possible values of arbitrary variables is somewhere between hard and impossible even for small programs. Nobody wants to have to:

formally prove that the index can't be out of bounds, and
encode that proof into the type system

... for every single slice/Vec/etc. access, because that's what you'd have to do to perform bounds checks at compile time. You essentially need dependent typing.

Aside from possibly making type checking undecidable (and getting a program to type check vastly harder), type inference becomes impossible in general (and far more restricted in the best case), types get much more complicated and wordy, and the complexity of the language increases significantly. That indices are in bounds can only be proven without significant additional programmer effort in very simple circumstances.

Furthermore, there is little incentive to get rid of bounds checks. Lifetimes pull their weight by almost entirely eliminating the need for garbage collection --- which is a huge, invasive feature with unpredictable throughput, space and latency implications. Run-time bounds checking in the other hand is very non-invasive, has a small and well-known overhead, and can be selectively turned off in performance-critical sections even if the entire rest of the program uses it liberally.

Note that the compiler can do some simple checks for out-of-bounds access of arrays:

let a = [1, 2];
let element = a[100];

error: index out of bounds: the len is 2 but the index is 100
 --> src/main.rs:3:19
  |
3 |     let element = a[100];
  |                   ^^^^^^
  |
  = note: #[deny(const_err)] on by default

However, this is limited and is easily avoided by making the index value not an "obvious" constant:

let a = [1, 2];
let idx = 100;
let element = a[idx];

edited Nov 04 '18 at 14:39

Shepmaster

388,571
95
1,107
1,366

answered Feb 08 '15 at 01:51

3

It is a reach to suggest run times bounds checking is "very non-invasive". The impact will be related to the complexity of algorithms that are working on the array. For measures like run time, essentially adding a bounds check on each array access is a constant multiplier. – Rob Feb 08 '15 at 02:45
@Rob A constant factor on the running time of some operation never changes algorithm complexity. It may very well have unacceptable impact in the (non-asymptotic) running time. But *as I said*, for any individual array access the programmer can opt out of bounds checking, and if they do, performance is equivalent to C. Bounds checking does not impact any part of the program that doesn't use it. And *that* is what I mean by non-invasive (I mention the performance impact elsewhere in the same sentence.) – Feb 08 '15 at 10:49
1

I didn't suggest there would be any change of algorithm complexity. Being able to opt out of a runtime feature doesn't make it non-invasive. – Rob Feb 08 '15 at 11:16
@Rob How is it invasive? Or conversely, what would be a non-invasive run-time feature? – Feb 08 '15 at 11:21
1

To answer your question in reverse, an invasive runtime feature is one that has a wide-spread impact that is potentially significant at run time in realistic scenarios unless explicitly disabled (or not employed). That is, admittedly, an imperfect adaptation of english-language medical definitions (invasive procedures are wide-spread and/or have intense local effect). A non-invasive feature is the opposite. BTW: I'm not suggesting runtime bounds checking is good or bad, just that the term non-invasive seems to promise too much. – Rob Feb 08 '15 at 12:00
@Rob Isn't any medical procedure which breaks the skin or enters a body cavity invasive, no matter how localized and small? Regardless, I take note of your definition, but don't think it's useful, since almost everything seems to be invasive under it. But fighting about terminology is rarely productive. We seem to agree about everything except the meaning of "invasive", that's good enough for me. – Feb 08 '15 at 12:12
In the case of C the cost of software bounds checking is actually bad in many situations and average overhead is 67% on SPEC: https://www.cis.upenn.edu/acg/papers/pldi09_softbound.pdf Is the situation much better for Rust? Does anyone have any data to back up your small overhead claim for Rust bounds checking? – Catalin Hritcu Jun 23 '17 at 23:48
This answer is extremely dissatisfying. The type of an array in rust is `[type; length]`, is it not? So for arrays, which in Rust are by definition a fixed-length collection of values of the same type, why couldn't bounds be checked at compile time? It works for tuple fields. – Chase May Oct 13 '18 at 20:37
2

@ChaseMay Rust knows array's length, but may not always know the index value at compile time, which can be a variable whose value is returned from a function call. Tuple is different in that, tuple's index is actually the field name (like the field name of a struct), which cannot be a variable. – Xiao-Feng Li Jul 14 '19 at 20:13

score 0 · Answer 2 · answered Nov 22 '22 at 02:01

The reason is because: Although the array's length may be known ahead of time, the index being referenced may not be known ahead of time.

Simple example:

fn main() {
    let a = [1,2,3,4,5];
    let index = 10;

    let element = a[index];
}

In rust arrays are stack-variables, so their length does not change. The length of the array will therefore be known at compile-time.

After seeing an example like the above, it may seem that rust should be able to check for 'index out-of-bounds' errors at compile time.

However, although the array size is known at compile time, often the index at which the array is being referenced is not known at compile time.

Consider this more complex example:

fn main() {
    let a = [1,2,3,4,5];

    let mut guess = String::new();
    io::stdin().read_line(&mut guess).expect("Failed to read line");
    let guess: u32 = guess.trim().parse().expect("Please type a number!");

    let element = a[guess];
}

The index at which the array will be accessed is coming from user input. There is no way for the compiler to know the index at which the array will be accessed. The only way to know the index here, is to run the program. That's why rust deals with array-index-out-of-bounds errors then, at runtime, instead of at compile-time.

Why does Rust check array bounds at runtime, when (most) other checks occur at compile time?

2 Answers2

Linked