4

I'm trying to understand recursions and caching much better and still I'm making interesting progress(I'm a bit slow at times). I'm having small problems understanding this code:

# Fibonacci recursive with result storage

class FibonacciStorage:
    _storage = { 0:1, 1:1 }
    @staticmethod
    def _fib(number):
        try: # is this already calculated
            return FibonacciStorage._storage[number]   #searches dict, and if value exists then return it
        except KeyError:
            result = FibonacciStorage._fib(number-1) + FibonacciStorage._fib(number-2)  #this is the actual Fibonacci Formula
            FibonacciStorage._storage[number] = result  #adds result to dictionary
            #print FibonacciStorage._storage #this shows the storage list growing with each iteration.
            return result
    @staticmethod
    def fib(number):  #first function, it has two asserts to basically make sure number is whole/positive and if its okay passes it to _fib(where the real work is done)
        # only do the assert statements once
        assert(isinstance(number,int)),"Needs a whole number" 
        assert(number>0),"Needs a positive whole number"
        return FibonacciStorage._fib(number)

# use a more readable name
getFib = FibonacciStorage.fib

print getFib(50)

I found this code on the net and tried to comment each line to understand what it is actually doing. I don't understand how this is a recursion, it does loop through the code until the correct result is given but I don't see how its calling itself.

I thought it was this line, at first: result = FibonacciStorage._fib(number-1) + FibonacciStorage._fib(number-2) but I'm confused because my print FibonacciStorage._storage line shows the storage cache growing and growing, yet the line below it is a return function. I suspect I do not understand how return result is actually somehow triggering the recursion again.

I'm also not sure how the storage dictionary is making this program go so fast. My simple recursions of the fib sequence take a long time(even when I save the data) but in this case if you do a print getFib(200) its instant. How is the caching making it so fast?

So to sum up, two questions:

1) How is the recursion actually being triggered? If my print statement is being accessed on each loop?

2) How is caching speeding this up over other fib sequences? Is the class structure or @staticmethod making a difference? For example, this seems instant while the ones on http://en.literateprograms.org/Fibonacci_numbers_(Python) have a slight delay.

3) Maybe also out of intrest, can a fibnocci type algo scale(split up and run in parallel) just out of intrest or are they limited to one process?

Any insight would be helpful.

Lostsoul
  • 25,013
  • 48
  • 144
  • 239
  • 2
    Understanding recursion is a good endeavor. The definition of the Fibonacci series is recursive, but a recursive implementation is an _anti-pattern_, because it is incredibly wasteful. Consider that `fib(7)` will result in 24 invocations, and will calculate the value of `fib(3)` _five_ times independently. In fact the total number of invocations for `fib(n)` is `2 * fib(n) - 1`. To calculate `fib(20)` (6765) requires 13,529 invocations. – Jim Garrison Mar 09 '12 at 03:49
  • If you are using Python 2.x, always inherit your class from `object` - you will get subtle errors otherwise (like finding out "staticmethod" does not work in your class, or others) – jsbueno Mar 09 '12 at 03:52
  • @JimGarrison would a recursion be not so bad if it cached like this one above? I am trying to understand it, but from my understanding so far it doesn't repeat any of its work, so wouldn't fib(20) result in only a few more than 20 loops since most of them are lookups against the cache? – Lostsoul Mar 09 '12 at 04:06
  • Yes, caching results improves things. But you still have to make `2*fib(n)-1` calls. A simple iterative solution is the only way to go. This is just a cautionary tale to remember that for _some_ recursive problems a linear solution is better. – Jim Garrison Mar 09 '12 at 04:21
  • @JimGarrison yikes..I guess software development is never as easy as it looks. Thanks for the warning and keep hacking code to try to figure out when to use which tool..Thanks for helping guide a newbie's attempts to learn! – Lostsoul Mar 09 '12 at 04:49
  • 1
    You are already beyond the basics in your understanding, but you might find [my answer to a similar question](http://stackoverflow.com/questions/1949454/understanding-basic-recursion/1949502#1949502) interesting. – Jim Garrison Mar 09 '12 at 06:36
  • @JimGarrison Thank you very much, it was very helpful. I like the way you explained it. I'll browse some of your other answers and I know I'll learn alot. Thanks! – Lostsoul Mar 09 '12 at 06:42

3 Answers3

6

(1) The recursive call is on this line:

result = FibonacciStorage._fib(number-1) + FibonacciStorage._fib(number-2)
//                   here ^                        and here ^

You're right, this is "the actual Fibonacci formula", which is recursive by definition.

(2) Caching is speeding it up immensely. What's making a difference is the persistent object _storage, which is saving you a spectacular amount of duplicate work. Consider:

                                 fib(4)
                               /        \
                          fib(3)   +     fib(2)
                         /    \          /     \
                    fib(2) + fib(1)   fib(1) + fib(0)
                    /    \
               fib(1) + fib(0)

This is just fib(4), and you can already see the redundancy. By simply caching the integer values of fib(3) and fib(4), you'll reduce the number of calculations required for fib(5) from 7 to 1. And the savings only grow as you go higher.

calebds
  • 25,670
  • 9
  • 46
  • 74
  • but if thats the case, shouldn't it just go to the recursion? and not do my print statement below? If you uncomment it, you'll see it grows at every turn. – Lostsoul Mar 09 '12 at 03:26
  • I kind of understand what its doing, I'm intrested in understanding the programming aspect of it behind the scenes..for example, I understand caching makes it faster but I don't understand why this caching is making it faster than the one on this site: http://en.literateprograms.org/Fibonacci_numbers_(Python) – Lostsoul Mar 09 '12 at 03:36
  • @learningJava When the recursive call to `fib` `return`s, it will pick up just where it left off in the `fib` that made the call.. – calebds Mar 09 '12 at 03:40
2

The fibonacci implementation in the question uses a technique called memoization for storing the values already computed and efficiently retrieving them each time they're needed, instead of re-calculating them. This has a huge performance benefit, because values only need to be calculated once.

About your questions:

  1. The recursion is triggered in this line: FibonacciStorage._fib(number-1) + FibonacciStorage._fib(number-2), as you can see the _fib procedure is calling itself
  2. The caching is greatly improving the speed of the calculation, see this example for a complete explanation of how this works. Using static methods isn't particularly useful in this case, it only makes the implementation more contrived
  3. A recursive, non-memoized version might be split in a parallel algorithm where each call to fibonacci gets performed on a different thread. However the overhead would be considerable, and the algorithmic complexity wouldn't be improved at all. Stick to the memoized version, or read about faster ways to calculate fibonacci.
Óscar López
  • 232,561
  • 37
  • 312
  • 386
  • Thank you Oscar never heard of memoization but looks like what I wanted to learn about..thanks, but regarding the fib recursion, I thought that was calling it as well but shouldn't it immediately go and start from the top? But if you uncomment my print statement, you can see it growing in values which makes me think that print statement is being executed with every loop. If thats the case then how is the FibonacciStorage._fib(number-1) actually calling it? If it was, wouldn't my print statement never be executed because it would build the dictionary cache until the try statement works – Lostsoul Mar 09 '12 at 04:02
  • and once the try statement works, it would just return the results and finish...how does the print statement work? if the FibonacciStorage._fib(number-1) is being called right away? or is there a delay(does it wait for the function to end before doing the recursion)? Sorry, this is where I'm confused.. – Lostsoul Mar 09 '12 at 04:03
  • The print statement would get called whenever a value is not in the cache, right after it gets added. I think you should trace the execution of the program to better understand it, use a debugger. – Óscar López Mar 09 '12 at 04:08
  • okay..never done that before but I'll look up tracing..Thanks Oscar. – Lostsoul Mar 09 '12 at 04:11
1

It's good to know that Python dicts have a __missing__ method that's invoked for missing keys:

class FibonacciStorage(dict):
    def __missing__(self, n):
        self[n] = self[n - 1] + self[n - 2]
        return self[n]

Usage:

fib = FibonacciStorage({0: 1, 1: 1})
print(fib[50]) #20365011074
Reinstate Monica
  • 4,568
  • 1
  • 24
  • 35