4

As I was reading the rust documentation, I stumbled upon this code that iterates the array a using a while loop (with an index):

fn main() {
    let a = [10, 20, 30, 40, 50];
    let mut index = 0;

    while index < 5 {
        println!("the value is: {}", a[index]);

        index += 1;
    }
}

The documentation says:

... this approach is error prone; we could cause the program to panic if the index length is incorrect. It’s also slow, because the compiler adds runtime code to perform the conditional check on every element on every iteration through the loop.

The first reason was self-explanatory. The second reason was where I got confused.

Furthermore, they suggested to use a for-loop for this.

fn main() {
    let a = [10, 20, 30, 40, 50];

    for element in a.iter() {
        println!("the value is: {}", element);
    }
}

I just can't seem to wrap my head around this. Is there some kind of behavior that the Rust compiler does?

Shepmaster
  • 388,571
  • 95
  • 1,107
  • 1,366
Chubo Chan
  • 43
  • 2
  • What exactly is your question? – melpomene May 28 '19 at 16:26
  • https://doc.rust-lang.org/reference/expressions/loop-expr.html#iterator-loops, please check the `is equivalent to` part – Ömer Erden May 28 '19 at 16:33
  • 1
    Think of the operation `a[index]` for an arbitrary slice `a`. What must the program do to ensure that reading the element at position `index` is safe? Now read that paragraph again. – E_net4 May 28 '19 at 16:33

1 Answers1

6

The two parts are complementary:

we could cause the program to panic if the index length is incorrect.

Every time you write some_slice[some_index], the standard library does the equivalent of:

if some_index < some_slice.len() {
    some_slice.get_the_value_without_checks(some_index)
} else {
    panic!("Hey, stop that");
}

the compiler adds runtime code to perform the conditional check on every element

In a loop, this works out to be something like:

while some_index < limit {
    if some_index < some_slice.len() {
        some_slice.get_the_value_without_checks(some_index)
    } else {
        panic!("Hey, stop that");
    }
    some_index += 1;
}

Those repeated conditionals aren't the most efficient code.

The implementations of Iterator for slices utilize unsafe code to be more efficient at the expense of more complicated code. The iterators contain raw pointers to the data but ensure that you can never misuse them to cause memory unsafety. Without needing to perform that conditional at each step, the iterator solution is often faster1. It's more-or-less equivalent to:

while some_index < limit {
    some_slice.get_the_value_without_checks(some_index)
    some_index += 1;
}

See also:


1 — as Matthieu M. points out:

It should be noted that the optimizer may (or may not) be able to remove the bounds check in the while case. If it succeeds, then performance is equivalent; if it fails, suddenly your code is slower. In microbenchmarks, with simple code, changes are it will succeed... but this may not carry to your production code, or it may carry now, and the next change in the loop body will prevent the optimization, etc... In short, a while loop can be a performance ticking bomb.

See also:

Shepmaster
  • 388,571
  • 95
  • 1,107
  • 1,366
  • This hit the nail! Thank you very much. – Chubo Chan May 29 '19 at 05:36
  • 2
    It should be noted that the optimizer may (or may not) be able to remove the bounds check in the `while` case. If it succeeds, then performance is equivalent; if it fails, suddenly your code is slower. In microbenchmarks, with simple code, changes are it will succeed... but this may not carry to your production code, or it may carry now, and the next change in the loop body will prevent the optimization, etc... In short, a `while` loop can be a performance ticking bomb. – Matthieu M. May 29 '19 at 13:00
  • @Shepmaster how does the iteration version detect end of data? Won’t it have to keep checking for some sentinel or length each iteration? – User 10482 Sep 16 '21 at 14:33
  • @User10482 yes, both implementations need to do a check to ensure that the iteration is finished. The indexing form can perform an _additional_ bounds check on top of that. The [implementation](https://github.com/rust-lang/rust/blob/1d71ba862309d59df710078a845c8772ffb22aba/library/core/src/slice/iter/macros.rs#L134-L151) is heavily optimized. – Shepmaster Sep 28 '21 at 16:40