Why does memoization not work?

Question

After reading a memoization introduction I reimplemented the Fibonacci example by using a more general memoize function (only for learning purposes):

memoizer :: (Int -> Integer) -> Int -> Integer
memoizer f = (map f [0 ..] !!)

memoized_fib :: Int -> Integer
memoized_fib = memoizer fib
    where fib 0 = 0
          fib 1 = 1
          fib n = memoized_fib (n-2) + memoized_fib (n-1)

This works, but when I just change the last line to the following code, memoization suddenly does not work as I expected (the program becomes slow again):

          fib n = memoizer fib (n-2) + memoizer fib (n-1)

Where is the crucial difference w.r.t. to memoization?

related questions: http://stackoverflow.com/q/11466284 , http://stackoverflow.com/q/11473130 . — Will Ness, Aug 18 '12 at 20:36

Will Ness · Accepted Answer · 2012-08-19T09:10:23.963

6

It is about explicit vs. implicit sharing. When you explicitly name a thing, it naturally can be shared, i.e. exist as separate entity in memory, and reused. _{(Of course sharing is not part of the language per se, we can only nudge the compiler ever so slightly towards sharing certain things).}

But when you write same expression twice or thrice, you rely on compiler to replace the common sub-expressions with one explicitly shared entity. That might or might not happen.

Your first variant is equivalent to

memoized_fib :: Int -> Integer
memoized_fib = (map fib [0 ..] !!)  where
        fib 0 = 0
        fib 1 = 1
        fib n = memoized_fib (n-2) + memoized_fib (n-1)

Here you specifically name an entity, and refer to it by that name. But that is a function. To make the reuse even more certain, we can name the actual list of values that gets shared here, explicitly:

memoized_fib :: Int -> Integer
memoized_fib = (fibs !!)  where
        fibs = map fib [0 ..] 
        fib 0 = 0
        fib 1 = 1
        fib n = memoized_fib (n-2) + memoized_fib (n-1)

The last line can be made yet more visually apparent, with explicit reference to the actual entity which is shared here - the list fibs which we just named in the step above:

        fib n = fibs !! (n-2) + fibs !! (n-1)

Your second variant is equivalent to this:

memoized_fib :: Int -> Integer
memoized_fib = (map fib [0 ..] !!)  where
        fib 0 = 0
        fib 1 = 1
        fib n = (map fib [0 ..] !!) (n-2) + (map fib [0 ..] !!) (n-1)

Here we have three seemingly independent map expressions, which might or might not get shared by a compiler. Compiling it with ghc -O2 seems to reintroduce sharing, and with it the speed.

edited Aug 19 '12 at 09:10

answered Aug 18 '12 at 21:05

Will Ness

70,110
9
98
181

In the second example, you could just write `fib n = fibs !! (n-2) + fibs !! (n-1)`. I'm not sure if that defeats your point, but I'm also not certain that the version you wrote shares. – Ben Millwood Aug 18 '12 at 22:03
From an aesthetic point of view I would stick to Will Ness' version of the second solution because it is more readable. – Bastian Aug 19 '12 at 08:05
@BenMillwood yes you could; but the two behaved exactly the same when I checked. I.e., exhibited sharing under -O2 compilation. Because it's the named list `fibs` that gets shared here (which was my point). If the OP's 1st version shared, then my re-write of it too has to share, because it explicitly names the expression that the compiler shared even when unnamed. – Will Ness Aug 19 '12 at 08:54
@Bastian I actually prefer the more explicit version with `fibs !! (n-2) + ... `, because there we *see* the explicit reference to the list which is actually the entity which is shared - the list `fibs`. Just tried not to make too long a post. Will edit this in. – Will Ness Aug 19 '12 at 08:57
@ Will Ness Please be patient with me for another 2 days - due to an exam on Friday I currently don't have the time to bother with this problem. I will get back to it at the weekend. But thank you for your dedication! :-) – Bastian Aug 22 '12 at 18:34
So I just reread your post and am going with it. Is there any Haskell reference to this topic to get more familiar with it (e.g. some book about Haskell)? – Bastian Aug 28 '12 at 08:30
@Bastian have no idea. :) really. (Haskell has no documentation, is what I think). You can try some more SO entries linked on the right here. The main thing is - memoization or not, sharing or not, is not part of the Language, at best it is a part of a compiler. Even when we name stuff, compiler might choose to recalculate it instead of reusing it. We have **no** control in the matter. Some say it is *missing* in Haskell (in that respect). – Will Ness Aug 28 '12 at 08:38

ony · Answer 2 · 2012-08-18T16:48:18.277

3

momoized_fib = ... - that's top-level simple definition. it might be read as a constant lazy value (without any additional arguments required to be bound before expanding it. That's kinda "source" of your memoized values.

When you use (memoizer fib) (n-2) creates new source of values which have no relation with memoized_fib and thus it isn't reused. Actually you move a lot of load on GC here since you produce a lot (map fib [0 ..]) sequences in second variant.

Consider also more simple example:

f = \n -> sq !! n where sq = [x*x | x <- [0 ..]]
g n = sq !! n where sq = [x*x | x <- [0 ..]]

First will generate single f and associated with it sq because there is no n in head of declaration. Second will produce family of lists for each different value of f n and move over it (without bounding down to actual values) to get value.

edited Aug 18 '12 at 16:48

answered Aug 18 '12 at 16:37

ony

12,457
1
33
41

I think `g n = ...` is a syntactic sugar for `g = \n-> ...`. There shouldn't be any difference. You're thinking about `f = (sq !!) where sq = ...`, but that's not what you wrote. Plus, the definition of `sq` is actually *not dependent* on *n*, so compiler is free to reuse the `sq` list. Unless prevented by polymorphism in the absence of monomorphic type signature of course. – Will Ness Aug 18 '12 at 20:43
3

@WillNess: the key is whether or not the `where` clause is inside or outside the lambda, which the syntactic sugar *does* change, since `where` only attaches to declarations, not expressions. – Ben Millwood Aug 18 '12 at 22:01
@ony turns out I was wrong, sorry, :) `where` goes with the `f`, so you indeed wrote what you thought you wrote, and not what I thought I read. :) :) – Will Ness Aug 19 '12 at 09:37
@BenMillwood thanks for the correction! so the difference is between the [pattern binding](http://www.haskell.org/onlinereport/haskell2010/haskellch4.html#x10-860004.4.3.2) for `f` producing `f = let sq=[...] in (\n-> sq !! n)`, and a [function binding](http://www.haskell.org/onlinereport/haskell2010/haskellch4.html#x10-840004.4.3.1) for `g` producing an equivalent of `g = \n-> let sq = [...] in sq !! n` - correct? Still, compiler can float the `let sq =` (right?) out of the lambda in `g`, theoretically, since the definition of `sq` does not refer to `n` anywhere... unless it's polymorphic... – Will Ness Aug 19 '12 at 10:41
@WillNess: Yup, that's correct and it could indeed be floated out. It probably isn't a good idea, though - `sq` is very cheap to create, so keeping it in memory is a waste. It's better to be explicit - generally, you do not want to depend on compiler optimizations to improve asymptotic behaviour of your code (as with `fib`). – Vitus Aug 19 '12 at 14:18
@Vitus thanks for the confirmation! :) as for `sq`, it was just an example... About explicitness, couldn't agree more, that's exactly the point I tried to make in [my answer here](http://stackoverflow.com/a/12022253/849891). Explication is a valuable design principle in its own right too. – Will Ness Aug 19 '12 at 14:57

Why does memoization not work?

2 Answers2