
I have created the following simple Fibonacci implementations:

#![feature(test)]
extern crate test;

pub fn fibonacci_recursive(n: u32) -> u32 {
    match n {
        0 => 0,
        1 => 1,
        _ => fibonacci_recursive(n - 1) + fibonacci_recursive(n - 2)
    }
}

pub fn fibonacci_imperative(n: u32) -> u32 {
    match n {
        0 => 0,
        1 => 1,
        _ => {
            let mut penultimate;
            let mut last = 1;
            let mut fib = 0;
            for _ in 0..n {
                penultimate = last;
                last = fib;
                fib = penultimate + last;
            }
            fib
        }
    }
}

I created them to try out cargo bench, so I wrote the following benchmarks:

#[cfg(test)]
mod tests {
    use super::*;
    use test::Bencher;

    #[bench]
    fn bench_fibonacci_recursive(b: &mut Bencher) {
        b.iter(|| {
            let n = test::black_box(20);
            fibonacci_recursive(n)
        });
    }

    #[bench]
    fn bench_fibonacci_imperative(b: &mut Bencher) {
        b.iter(|| {
            let n = test::black_box(20);
            fibonacci_imperative(n)
        });
    }
}

I know that a recursive implementation is generally slower than an imperative one, especially since Rust doesn't guarantee tail call optimization (which this implementation couldn't use anyway). But I was not expecting a difference of nearly 2,000 times:

running 2 tests
test tests::bench_fibonacci_imperative ... bench:          15 ns/iter (+/- 3)
test tests::bench_fibonacci_recursive  ... bench:      28,435 ns/iter (+/- 1,114)

I ran it both on Windows and Ubuntu with the newest Rust nightly compiler (rustc 1.25.0-nightly) and obtained similar results.

Is this speed difference normal? Did I write something "wrong"? Or are my benchmarks flawed?

  • You've implemented entirely different algorithms. Your recursive implementation keeps re-evaluating intermediate calculations and is O(2^n); your imperative one is O(n). This is a well-known result. Choose a different implementation for the recursive one. Here are some alternative implementations, albeit not in Rust: https://stackoverflow.com/a/24439070/2166798 – pjs Mar 01 '18 at 14:57
  • Simple. You are recalculating the same values over and over in your recursive implementation. For example, think about how f(n) calculates f(n-1) and f(n-2), but f(n-1) has already calculated f(n-2), which you are not taking advantage of. See here: https://stackoverflow.com/q/360748/912144 – Shahbaz Mar 01 '18 at 14:58
  • https://en.wikipedia.org/wiki/Memoization – Shepmaster Mar 01 '18 at 15:02
  • Thanks guys, that makes sense :) Going to rewrite the function to use memoization and then run `cargo bench` again for an answer. If any of you wants to write it instead, I will accept that. – Jan Hohenheim Mar 01 '18 at 15:08
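For readers following along: a memoized variant along the lines suggested in the comments above might look like the following. This is a sketch, not code from the original post; the `fibonacci_memoized` name and the `HashMap` cache are illustrative choices.

```rust
use std::collections::HashMap;

// Sketch of the memoization suggested in the comments (not from the
// original post): cache each fib(n) the first time it is computed, so
// every value is computed exactly once and the O(1.6^n) recursion
// becomes O(n).
pub fn fibonacci_memoized(n: u32, cache: &mut HashMap<u32, u32>) -> u32 {
    if let Some(&cached) = cache.get(&n) {
        return cached;
    }
    let result = match n {
        0 => 0,
        1 => 1,
        _ => fibonacci_memoized(n - 1, cache) + fibonacci_memoized(n - 2, cache),
    };
    cache.insert(n, result);
    result
}

fn main() {
    let mut cache = HashMap::new();
    assert_eq!(fibonacci_memoized(20, &mut cache), 6765);
}
```

Note that the cache must be threaded through as a parameter (or held in a struct), since the plain recursive signature `fn(u32) -> u32` has nowhere to store it.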

2 Answers


As said by Shepmaster, you should use accumulators to keep the previously calculated fib(n - 1) and fib(n - 2); otherwise you keep recalculating the same values:

pub fn fibonacci_recursive(n: u32) -> u32 {
    fn inner(n: u32, penultimate: u32, last: u32) -> u32 {
        match n {
            0 => penultimate,
            1 => last,
            _ => inner(n - 1, last, penultimate + last),
        }
    }
    inner(n, 0, 1)
}

fn main() {
    assert_eq!(fibonacci_recursive(0), 0);
    assert_eq!(fibonacci_recursive(1), 1);
    assert_eq!(fibonacci_recursive(2), 1);
    assert_eq!(fibonacci_recursive(20), 6765);
}

last is equivalent to fib(n - 1).
penultimate is equivalent to fib(n - 2).
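Traced by hand for n = 4, the accumulator version makes exactly one call per step, which is why it is linear (same code as above, with the trace written out as a comment):

```rust
pub fn fibonacci_recursive(n: u32) -> u32 {
    // Each call decrements n by exactly 1, carrying the two previous
    // Fibonacci numbers along as accumulators: O(n) calls in total.
    fn inner(n: u32, penultimate: u32, last: u32) -> u32 {
        match n {
            0 => penultimate,
            1 => last,
            _ => inner(n - 1, last, penultimate + last),
        }
    }
    inner(n, 0, 1)
}

fn main() {
    // inner(4, 0, 1) -> inner(3, 1, 1) -> inner(2, 1, 2) -> inner(1, 2, 3) = 3
    assert_eq!(fibonacci_recursive(4), 3);
    assert_eq!(fibonacci_recursive(20), 6765);
}
```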

  • Thanks. I'm going to accept this answer because it shows people that might stumble on this how to improve it. By the way, `cargo bench` shows that this solution only takes `2 ns/iter` – Jan Hohenheim Mar 01 '18 at 15:19
  • @JanNilsFerner Looks like tail call optimization is at work ;) I read that in *some* circumstances (not guaranteed) LLVM can do TCO. – Boiethios Mar 01 '18 at 15:28
  • @Boiethios: Indeed. Basically, TCO boils down to: the recursive call is the *last* thing that your function does. Here, the last thing `inner` does is call `inner`, so it qualifies... although it's not always as obvious in C++ or Rust because of destructors, which are executed last even though they don't appear in the code and may therefore prevent TCO. Here, no destructor, everything's fine :) – Matthieu M. Mar 01 '18 at 15:34
  • I just realized that your implementation has a tiny flaw in it: It underflows on zero – Jan Hohenheim Mar 01 '18 at 16:09
  • @JanNilsFerner Done – Boiethios Mar 01 '18 at 16:20
  • @Boiethios Thanks :) – Jan Hohenheim Mar 01 '18 at 16:25

The algorithmic complexity between the two implementations differs:

  • your iterative implementation uses an accumulator: O(N),
  • your recursive implementation doesn't: O(1.6^N).

Since 20 (N) ≪ 12,089 (1.6^N for N = 20), it's pretty normal to have a large difference.

See this answer for an exact computation of the complexity in the naive implementation case.
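The blow-up can also be measured directly by instrumenting the naive recursion to count how often it is entered. This is a quick sketch, not from the original answers; the `count_calls` helper is an illustrative name.

```rust
// Naive recursion from the question, plus a counter incremented on
// every entry, to make the exponential call count visible.
fn count_calls(n: u32, calls: &mut u64) -> u32 {
    *calls += 1;
    match n {
        0 => 0,
        1 => 1,
        _ => count_calls(n - 1, calls) + count_calls(n - 2, calls),
    }
}

fn main() {
    let mut calls = 0;
    let fib = count_calls(20, &mut calls);
    assert_eq!(fib, 6765);
    // The naive recursion is entered 2 * fib(21) - 1 = 21,891 times for
    // n = 20, while the accumulator versions need only ~20 steps.
    assert_eq!(calls, 21_891);
}
```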

Note: the method you use for the iterative case is called dynamic programming.

  • Thank you for the more analytical answer, it's quite interesting. Also, I didn't know about the term "dynamic programming", so thanks for that link as well! – Jan Hohenheim Mar 01 '18 at 15:24
  • Or you could say the recursive version has `O(Fib(N))` complexity. In fact, [`θ(Fib(n))` is an exact bound](https://stackoverflow.com/a/22084314/224132), and `3 * Fib(N) - 3` is the number of additions performed. (Of course, the recursing and comparing against the base case costs more time than the actual addition, so better just stick to the complexity class!) – Peter Cordes Mar 01 '18 at 18:54