4

The article Memoization via Representables does a great job at explaining how to memoize recursive functions via Representables. It starts by implementing the Fibonacci sequence as a fixed point of fibOp:

fibOp :: Num a => (Natural -> a) -> (Natural -> a)
fibOp v 0 = 0
fibOp v 1 = 1
fibOp v n = v (n-1) + v (n-2)

fix f = let x = f x in x

fibNaive :: Num a => Natural -> a
fibNaive = fix fibOp  

This implementation is not efficient as it calculates the same values many times.

The article goes on to introduce the mutual inverse functions streamTabulate and streamIndex (which will later be generalised in the Representable type class). These functions allow us to implement a memoized versions of fibNaive:

fibSmart :: Num a => Natural -> a
fibSmart = fix (streamIndex . streamTabulate . fibOp)

And it is at this point where the article gets somewhat "handwavy":

If we compose our tabulation function with fibOp, we get a function turns a function v into a Stream, over which we can index to get back a function. In this case, however, the same stream is shared for all arguments.

What does this mean exactly? Apparently this memoization technique works because of Haskell's lazy evaluation strategy and because it tries not to evaluate expressions "unnecessarily". For example this answer mentions that Haskell computes any given expression at most once per time that its surrounding lambda-expression is entered. The surrounding lambda-expression of streamTabulate however is entered once per iteration induced by the surrounding fix, which would mean streamTabulate is potentially evaluated multiple times.

Why and how does this approach work? Which aspect of Haskell's evaluation strategy does it exploit?

michid
  • 10,536
  • 3
  • 32
  • 59
  • Can you elaborate on "The surrounding lambda-expression of `streamTabulate`". I don't see what you mean by that. – Noughtmare Sep 29 '22 at 14:59
  • `fibSmart` can be expanded to `\n -> fix (\f -> streamIndex $ streamTabulate $ fibOp $ f) n`. So if my understanding of that other SO answer I linked is correct, the surrounding lambda expression of `streamTabulate` is `\f -> ...` . – michid Sep 29 '22 at 15:03
  • FWIW, I made all code for exploring this topic available via GitHub: https://github.com/mduerig/representable-memo – michid Oct 01 '22 at 10:56

1 Answers1

6

Your function:

fibSmart :: Num a => Natural -> a
fibSmart = fix (streamIndex . streamTabulate . fibOp)

can indeed be expanded by inlining (.) to:

fibSmart = fix (\f -> streamIndex (streamTabulate (fibOp f)))

but the lambda \f -> ... here is only entered once by the fix function.

You can also see that by further inlining fix:

fibSmart = let f = streamIndex (streamTabulate (fibOp f)) in f
user2407038
  • 14,400
  • 3
  • 29
  • 42
Noughtmare
  • 9,410
  • 1
  • 12
  • 38
  • The inlining of fix is what does the trick indeed. Fascinating! – michid Sep 29 '22 at 15:36
  • @michid, note that inlining `fix` is not required for the memoization. Adding a `{-# NOINLINE fix #-}` pragma doesn't change the performance much. – Noughtmare Sep 29 '22 at 15:47
  • For a further exercise I implemented the same thing in Java. It was interesting to see where I had to manually add lazy evaluation and memoization to make this work. Most obviously the stream needs a lazy tail. But only after manually inlining `fix` did I see the performance gain. Interestingly after inlining the memoization of `streamTabulate` becomes very explicit. https://gist.github.com/mduerig/2116556dc525f4a0a12cf6fae415fb93 – michid Sep 29 '22 at 20:24
  • @michid I'm honestly not sure how your Java `fix` works exactly, but it *looks* like it doesn't "do anything" until the function it returns is actually called (all the code is inside the `R apply(T t)` method of the `new Function`). Whereas Haskell `fix` "does its thing" once and for all *before* the returned function receives any argument (`fix` might not even be returning a function, so it *can't* work that way). By inlining it, are you effectively doing the same thing: "pre-computing" the fixpoint before you invoke it, instead of on every invocation? – Ben Sep 30 '22 at 00:24
  • @Ben, "Whereas Haskell fix "does its thing" once and for all before the returned function receives any argument". AFAICS this is the reason why this memoization technique works in Haskell but not in Java. – michid Sep 30 '22 at 05:57