2

Is there a noticeable computation time difference between recursion-style Fibonacci vs. loop-style Fibonacci? I keep running Fibonacci to 40 places using recursion and then using a loop directly afterwards. It seems as though the computation time difference is only academic.

Written in C

Recursive solution:

    int main(int argc, const char * argv[]) {
    int n, i = 0, c;
    printf("Please enter an integer:\n");
    scanf("%d", &n);
    for ( c = 1 ; c <= n ; c++ )
    {
        printf("%lu ", fibonacci(i));
        i++;
    }
    return 0;
    }

long fibonacci(long n)
{
    if ( n == 0 )
        return 0;
    else if ( n == 1 )
        return 1;
    else
        return ( fibonacci(n-1) + fibonacci(n-2) );
};

For-loop solution:

int main(int argc, const char * argv[]) {
int n, first = 0, second = 1, next, c;
    printf("Please enter an integer:\n");
    scanf("%d", &n);
    for ( c = 0 ; c < n ; c++ )
    {
        if ( c <= 1 )
            next = c;
        else
        {
            next = first + second;
            first = second;
            second = next;
        }
        printf("%d ",next);
    }
    return 0;
};
  • Looks like you've answered your own question in the first paragraph already. – ci_ Mar 16 '15 at 22:18
  • 1
    The run time for a single computation will not be significant. If you want to benchmark, you need to compute many (10^7?) times. – mattm Mar 16 '15 at 22:38
  • Interesting. I've become extremely interested in the Fibonacci sequence recently. How its found in so many representations in nature is beautiful and how this sequence is used so often in benchmarking machines. – Frank Palmisano Mar 16 '15 at 22:42
  • 1
    "40 places" ? - a 64 bit unsigned integer only holds 19 digits. The max value of n for a 64 bit unsigned integer is 93. – rcgldr Mar 16 '15 at 23:11
  • What I mean by that is 40 indices of the Fibonacci sequence. – Frank Palmisano Mar 16 '15 at 23:44
  • Try n = 50 instead of 40. Then try n = 80 and see if it makes a difference. – gnasher729 Mar 17 '15 at 01:08
  • There's a reason why theory tells you it takes longer. Even if the compiler optimises recursive to iterative, it doesn't change the fact that the recursive method has redundant calls and will always take longer. – softwarenewbie7331 Mar 17 '15 at 01:25

5 Answers5

4

The conventional recursion method is extremely slow compared to the tail recursive and iterative versions. In the example code below for the iteration version uses an unfolded loop along with Duff's Device to enter the loop. For 32 bit unsigned integers, the limit is fib(47), for 64 bit unsigned integers, the limit is fib(93).

Timing was done with Intel 2600K 3.4ghz, XP X64, 64 bit mode. XP or XP X64 high performance counter frequency is the same as the cpu clock, 3.4ghz, but operating system overhead (like interrupts), affects the timing if the duration is small.

Timings for fib(40):

fibr() # of microseconds    485468.8
fibt() # of microseconds         0.2
fibi() # of microseconds         0.2

Timing for 94 loops, n = 0 to 93:

fibt() # of microseconds         7
fibi() # of microseconds         5

Example code:

typedef unsigned long long UI64;

UI64 fibr(UI64 n)
{
    if(n < 2)
        return n;
    return fibr(n-1) + fibr(n-2);
}

// call with fibt(n, 0, 1)
UI64 fibt(UI64 n, UI64 res, UI64 next)
{
    if (n == 0)
        return res;
    return fibt(n - 1, next, res + next);
}

UI64 fibi(UI64 n)
{
UI64 f0, f1, i;
    if(n < 2)
        return n;
    n -= 2;
    f1 = f0 = 1;
    i = 0;
    switch(n%8){
        do{
            f1 += f0;
          case 7:
            f0 += f1;
          case 6:
            f1 += f0;
          case 5:
            f0 += f1;
          case 4:
            f1 += f0;
          case 3:
            f0 += f1;
          case 2:
            f1 += f0;
          case 1:
            f0 += f1;
          case 0:
            continue;
        }while(n >= (i += 8));
    }
    return f0;
}

Alternate version of fibi(), without the n<2 check. What f0 and f1 represent changes within the loop designed to end up with the final sum in f0, so the initial state of what f0 and f1 represent depends if n is even or odd. If n is even, f0 = fib(0) = 0, f1 = fib(-1) = 1, if n is odd, f1 = fib(0) = 0, f0 = fib(-1) = 1 . (In case you're curious, fib(-1) = 1, fib(-2) = -1, fib(-3) = 2, fib(-4) = -3, fib(-5) = 5, fib(-6) = -8, ... ).

To explain the logic here, for the n even case, fib(-1) = f1 = 1, fib(0) = f0 = 0, then fib(1) = (f1 += f0), fib(2) = (f0 += f1), fib(3) = (f1 += f0), fib(4) = (f0 += f1), ... .

UI64 fibi(UI64 n)
{
UI64 f0, f1, i;
    f0 = n & 1;         // if n even, f0=0, f1=1
    f1 = 1 - f0;        // else       f1=0, f0=1
    i = 0;
    switch(n%8){
        do{
            f1 += f0;
          case 7:
            f0 += f1;
          case 6:
            f1 += f0;
          case 5:
            f0 += f1;
          case 4:
            f1 += f0;
          case 3:
            f0 += f1;
          case 2:
            f1 += f0;
          case 1:
            f0 += f1;
          case 0:
            continue;
        }while(n >= (i += 8));
    }
    return f0;
}
rcgldr
  • 27,407
  • 3
  • 36
  • 61
  • This is definitive proof now. Thank you very much for putting this amazing and beautiful sequence to the test! These results are extremely interesting. I've never seen a switch statement used for this but it is aggressively efficient. I am just in **love** with this sequence. – Frank Palmisano Mar 17 '15 at 04:41
  • 1
    @FrankPalmisano - The other trick here is only two variables instead of three are used in the unfolded loop. The sense of what f0 and f1 represent changes in the unfolded loop, ending up with the correct value in f0. – rcgldr Mar 17 '15 at 05:48
  • I keep debugging to see this. If only we could find a way to limit the number of registers used as well for the switch statement. However, better to use registers vs have it in the stack; with the recursive function that is. – Frank Palmisano Mar 17 '15 at 05:53
  • 1
    @FrankPalmisano - if you're running in 32 bit mode, you can change the code to use 32 bit integers: | typedef unsigned int UI32; |, and replace the UI64's with UI32's. The limit is fib(47) instead of fib(93). This will reduce the number of registers. The switch statement only needs one register to index a jump table, but if there's a spare register, it may also use a register to point to the base adress of the jump table. – rcgldr Mar 17 '15 at 06:18
  • 1
    @FrankPalmisano - I added a second example of fibi(), showing that the initial values of f0 and f1 depend if n is even or odd. The first example doesn't have this problem because the initial values are f0 == f1 == 1. – rcgldr Mar 17 '15 at 06:20
2

A for-loop it is not necessarily faster. In general languages like Java, C, and Python, recursion is fairly expensive compared to iteration because it requires the allocation of a new stack frame.

It is possible to eliminate this overhead in C/C++ enabling compiler optimization to perform tail recursion, which transforms certain types of recursion (actually, certain types of tail calls) into jumps instead of function calls. To let the compiler performs this optimization it is necessary that the last thing a function does before it returns is call another function (in this case itself).

An example of Fibonacci function could be like this:

int fib_tail(int n, int res, int next) 
{
  if (n == 0) {
    return res;
  }
  return fib_tail(n - 1, next, res + next);
} 

and at assembly level, enabling compiler optimization, it will be implemented as a loop, e.g. sharing the same stack frame between calls.

I recently wrote an article about it.

Hope it helps.

codingadventures
  • 2,924
  • 2
  • 19
  • 36
  • Thank you for your explanation. Its nice to know that the compiler will do all of the optimization and cause even smaller differences in time complexity. However mathematically, the other response is correct as well. Your answer is more valid in real world scenarios where it really counts. – Frank Palmisano Mar 16 '15 at 22:55
  • 1
    you're welcome. Be very careful though. Tail recursion is not guaranteed by the C/C++ compiler so you always have to check if it enabled or not (i.e. look at the assembly). Worst problem you might have is getting a stackoverflow exception when using a recursive function which has not been optimized for some reason – codingadventures Mar 16 '15 at 23:04
  • Which C or C++ compiler are you using that performs tail recursion for the fibonacci function? Just curious. I'm quite sure Clang doesn't. – gnasher729 Mar 17 '15 at 01:20
  • This does not answer OP's question and probabiy confuses him further. OP has a misguided notion that naive recursive has equivalent complexity as iterative DP methods in 'practical world scenarios' after testing with n=40. – softwarenewbie7331 Mar 17 '15 at 01:26
1

For-loop solution is faster. reasons:

  1. no function calls (assuming that function calls are expensive)
  2. computing the nth uses n additions (the loop iterates n times), while the recursive solution uses an addition per function call, which sums to O(1.6n) calls, hence O(1.6n) additions. The cost came from the double recursive calls- when the recursive function asked for the nth element, it had to compute again the n-1th and the n-2th elements from the start, it not remember them.
Community
  • 1
  • 1
avim
  • 979
  • 10
  • 25
  • Mathematically your description makes sense. It's clear to see that recursion is slower growing than looping. However when talking about compiling as in the other answer, further optimization seems to have been made. – Frank Palmisano Mar 16 '15 at 22:52
  • 1
    True. But the main issue is to understand that there is a big difference in time complexity between the methods, the reason why they taking almost the same time is actually because the compiler makes those cases almost the same – avim Mar 16 '15 at 22:58
  • It's not 2^n calls. It's about 1.608^n calls. – gnasher729 Mar 17 '15 at 01:22
0

How did you measure the speed difference?

The naive recursive implementation of the Fibonacci function takes about 100 million function calls to calculate f (40). On a modern computer that will be fast enough that you cannot time it with a stop watch.

Calculating f (50) takes about 10 billion function calls, which will be a noticable delay. f (60) takes over a trillion function calls, or about an hour. f (70) takes about 200 trillion function calls or a few days. f (80) takes about 20 quadrillion function calls or about a year.

I wouldn't call that difference academic.

gnasher729
  • 51,477
  • 5
  • 75
  • 98
0

May be the way below will take less time? You can write a code generating a Fibonacci series avoiding the if-else statement that prints zero and one, and avoiding printing them outside the loop. You can do it by initializing the 'first' and 'second' variables with -1 and 1, so the sum between them will give you 0, which is the first organ of the series, and the loop will do the rest of the job.

#include <iostream>

using namespace std;
int main()
{
    int i, num, a = -1, b = 1, temp;
    cout << "enter a number:" << endl;
    cin >> num;
    for ( i = 0 ; i < num + 1 ; i++ ) 
    {
        cout << a + b << " "; 
        temp = a + b; 
        a = b;
        b = temp;
    }
    cout << endl;
    return 0;
}
ShimonD
  • 1
  • 2