7

I'm trying to understand something about Haskell functions.

First, here is a Fibonacci function defined in the typical "slow" way (i.e. recursive with no memoization, and no infinite-list tricks)

slowfib :: Int -> Integer
slowfib 0 = 0
slowfib 1 = 1
slowfib n = slowfib (n-2) + slowfib (n-1)

Next, a canonical memoizing version of the same. (Only slightly different from typical examples in tutorals/books/etc, because I prefer the prefix version of the !! operator.)

memfib = (!!) (map fib [0..])
  where
    fib 0 = 0
    fib 1 = 1
    fib k = memfib(k-2) + memfib(k-1)

The above solution uses partial application of the !! operator, which makes sense: we want memfib to end up as a function that takes a parameter, and we are defining it without including a parameter in the definition.

So far so good. Now, I thought I could write an equivalent memoizing function that does include a parameter in the definition, so I did this:

memfib_wparam n = ((!!) (map fib [0..])) n
  where
    fib 0 = 0
    fib 1 = 1
    fib k = memfib_wparam(k-2) + memfib_wparam(k-1)

(In Lambda calculus terms, memfib and memfib_wparams are just eta-conversions of each other. I think???)

This works, but the memoization is gone. In fact, memfib_wparam behaves even worse than showfib: Not only is it slower, but its memory usage is more than double.)

*Main> slowfib 30
832040
(1.81 secs, 921,581,768 bytes)
*Main> memfib 30
832040
(0.00 secs, 76,624 bytes)
*Main> memfib_wparam 30
832040
(2.01 secs, 2,498,274,008 bytes)

What's going on here? More importantly, what is my broader understanding of Haskell function definitions getting wrong? I was assuming the syntax I used in memfib_wparam was just syntactic sugar for what I did in memfib, but clearly it isn't.

Will Ness
  • 70,110
  • 9
  • 98
  • 181
cschatz
  • 258
  • 1
  • 9
  • 1
    [closely related](https://stackoverflow.com/questions/11466284/how-is-this-fibonacci-function-memoized). – Will Ness Feb 21 '20 at 20:11

1 Answers1

5

The difference is in when your fib function is bound.

where-bound definitions have access to the outer function's parameters (i.e. the parameters are "in scope" within where). This means that fib should have access to n, which in turn means that fib is defined after n is passed, which means it's a different fib for every n, which means it's a different call to map fib [0..] for every n.

If you wanted to eta-expand your memfib, this would be the "right" way to do it (i.e. without unduly expanding the scope of n):

memfib = \n -> theCache !! n
    where
        theCache = map fib [0..]

        fib 0 = 0 
        fib 1 = 1 
        fib k = memfib(k-2) + memfib(k-1)

If you're comparing with lambda calculus, the key difference is that eta-reduction/expansion doesn't say anything about performance, it just guarantees that the result of the program stays the same logically. Which it does.

Fyodor Soikin
  • 78,590
  • 9
  • 125
  • 172
  • 1
    Yes, fair point about eta-conversion and performance! Interestingly, your "right" eta-expansion performs somewhat differently from mine, but still seems to be making a different call to `map fib [0..]` for every `n`. For n=30, I get ~2.5GB allocated when running mine and ~2GB running yours (compared to ~75kB for the version with no parameter). Is it possible yours **is** still "unduly expanding the scope of `n`", just slightly differently somehow? Why? – cschatz Feb 21 '20 at 17:44
  • 1
    Yes, you are right, I actually fell into the very same pit with you :-) My version also expanded the scope of `n` - not over the `fib` definition, but over the `map` call. I have updated the answer. – Fyodor Soikin Feb 21 '20 at 18:07
  • Ok, thanks! That makes sense. Newbie user question: What's the expected management of the comments in this situation? Should I delete mine since the thing it refers to is no longer in the answer? Or is the idea to leave it, since anyone truly curious can see the edit history? – cschatz Feb 21 '20 at 23:20
  • No idea. I would leave it up as part of history. If it's truly annoying to anybody, it will get flagged and deleted later. – Fyodor Soikin Feb 21 '20 at 23:26
  • `fib` is actually bound [the very first](https://pastebin.com/6j8DWtgG) in both cases, I think, since the `where` scope is attached to definitions. the difference is whether `map fib [0..]` is floated. the decision is arbitrary in both cases (i.e. chosen by an implementation) -- there's no guarantee in the *language* for CAFs to be [memoizing](https://wiki.haskell.org/Memoization#Memoising_CAFS) AFAIK, it is something GHC implementors decided to do (?). long ago `-O2 -cse` would do this in 2nd case too. also, lack of / type signature [plays a role](https://stackoverflow.com/a/11466959/849891). – Will Ness Feb 22 '20 at 11:02