9

I am learning Haskell and I the following expression on Haskell Wiki really puzzled me:

fibs = 0 : 1 : zipWith (+) fibs (tail fibs)

I can't quite figure out why this works.

If I apply standard Currying logic (zipWith (+)) returns a function takes list as an argument and, in turn, returns another function that takes another list as an argument, and returns a list (zipWith::(a -> b -> c) -> [a] -> [b] -> [c]). So, fibs is a reference to a list (that has not yet been evaluated) and (tail fibs) is the tail of the same (unevaluated) list. When we try to evaluate (take 10 fibs), the first two elements are bound to 0 and 1. In other words fibs==[0,1,?,?...] and (tail fibs)==[1,?,?,?]. After the first addition completes fibs becomes [0,1,0+1,?,..]. Similarly, after the second addition we get [0,1,0+1,1+(0+1),?,?..]

  • Is my logic correct?
  • Is there a simpler way to explain this? (any insights from people who know what Haskell complier does with this code?) (links and reference are welcome)
  • It is true that this type of code only works because of lazy evaluation?
  • What evaluations happen when I do fibs !! 4?
  • Does this code assume that zipWith processes elements first to last? (I think it should not, but I don't understand why not)

EDIT2: I just found the above question asked and well answered here. I am sorry if I wasted anyone's time.

EDIT: here is a more difficult case to understand (source: Project Euler forums):

filterAbort :: (a -> Bool) -> [a] -> [a]
filterAbort p (x:xs) = if p x then x : filterAbort p xs else []

main :: Int
main = primelist !! 10000
         where primelist = 2 : 3 : 5 : [ x | x <- [7..], odd x, all (\y -> x `mod` y /= 0) (filterAbort (<= (ceiling (sqrt (fromIntegral x)))) primelist) ]

Note that all (\y -> x mod y /= 0)... How can referring to x here NOT cause infinite recursion?

Community
  • 1
  • 1
  • That seems about right. It also seems simple to me -- as you work with Haskell more, your brain will start to pick up these patterns and it will seem simple to you too. Good start. – luqui Jun 16 '11 at 05:38
  • 2
    First, `filterAbort` is the same as `takeWhile`. Second, you can avoid the even numbers by writing `[7,9..]`. Third, there's no need to have `5` in your initial list if you use `[5,7..]`. And last, the reason this works is rather deep. It's because for each prime `p` there's another prime before `p^2`. Is is a trivial consequence of a theorem by Lindemann (a prime between p and 2p). – augustss Jun 16 '11 at 08:09
  • Thanks, augustss. The optimizations and clean up you suggest make sense. As for termination, could you elaborate? What is `x` bound to in `(\y -> x mod y /= 0)`? I suspect my mistake is in thinking that it is bound to an infinite list. If it is bound to just one value (say, `7`) then there is no problem. Can you confirm? – Vladimir Bychkovsky Jun 16 '11 at 11:14
  • 1
    The `x` is bound to a single element, coming from an infinite list `[7..]`. There's no recursion here. – yatima2975 Jun 16 '11 at 12:36

1 Answers1

16

I'll trace the evaluation for you:

> fibs = 0 : 1 : zipWith (+) fibs (tail fibs)

With:

> zipWith f (a:as) (b:bs) = f a b : zipWith f as bs
> zipWith _ _      _      = []

> tail (_:xs)             =  xs
> tail []                 =  error "tail" 

So:

  fibs = 0 : 1 : zipWith (+) fibs (tail fibs)
↪ fibs = 0 : 1 : ((+) 0 1 : zipWith (+) (tail fibs) (tail (tail fibs)))
↪ fibs = 0 : 1 : 1 : ((+) 1 1 : zipWith (+) (tail (tail fibs)) (taii (tail (tail fibs)))))    
↪ fibs = 0 : 1 : 1 : 2 : ((+) 1 2 : zipWith (+) (tail (tail (tail fibs))) (tail (taii (tail (tail fibs))))))
↪ fibs = 0 : 1 : 1 : 2 : 3 : ((+) 2 3 : zipWith (+) (tail (tail (tail (tail fibs)))) (tail (tail (taii (tail (tail fibs)))))))

Etc. So if you index into this structure, it will force each step of fibs to be evaluated until the index is found.

Why is this efficient? One word: sharing.

As fibs is computed, it grows in the heap, recording values that have been computer. Later back references to fibs get to reuse all previously computed values for fibs. Free memoization!

What does that sharing look like in the heap?

enter image description here

As you ask for the object on the end of the list, Haskell computes the next value, records it, and moves that self-reference down a node.

The fact this terminates relies crucially on laziness.

Don Stewart
  • 137,316
  • 36
  • 365
  • 468
  • Thank you, Don. I am still a little fuzzy on how do Haskell known that in order to get the next element it needs to call fibs, and what does it mean to "call fibs" if fibs is really a list. I was looking around, and I found a [similar question](http://stackoverflow.com/questions/3887201/funky-haskell-lazy-list-implicit-recursion), but I am still not sure what the difference between '1' node in the list and 'fibs...' node to the Haskell compiler. – Vladimir Bychkovsky Jun 16 '11 at 01:09
  • 1
    You can think of `fibs` as just a pointer. So each time you refer to `fibs` it looks up the value. That value happens to be a list. – Don Stewart Jun 16 '11 at 01:16
  • Ah! That's a good point. I should correct my question. What does notation `0:1:zipWith ...` produce? It starts out as a list, which I understand. But then it gets to a node in the list that's a function call? Or it is better to think of all nodes in the list as function calls? I.e. say `ident x = x`, so `fib = (ident 0):(ident 1):(zipWith ...)`. I this case compiler evaluates *all* cells, but the last one ends up as a function call that returns a list? And `(zipWith ...)` happens to point to `fibs` list. Right? – Vladimir Bychkovsky Jun 16 '11 at 01:21
  • 3
    Simplified because I'm not an expert : Each non strict value in Haskell is stored in a structure called a thunk. It contains either a value, or some kind of pointer to code that can compute the value. So when you have `0:1:zipWith` the 3rd element is a thunk which points to code than can compute the rest of the list. When that code is run, the thunk is replaced by a cons cell which has the value for the 3rd element of the list, and a thunk that can compute the rest of the list from the 4th place. and so on. – David V. Jun 16 '11 at 06:38
  • Thanks, David. You are right understanding the concept of thunk is crucial here. Having read about [Lazy Evaluation](http://www.haskell.org/haskellwiki/Haskell/Lazy_evaluation) the first example makes sense to me. One really needs to understand what happens under the (compiler) covers, at least conceptually, to write good code. – Vladimir Bychkovsky Jun 16 '11 at 10:56