7

After reading a memoization introduction I reimplemented the Fibonacci example by using a more general memoize function (only for learning purposes):

memoizer :: (Int -> Integer) -> Int -> Integer
memoizer f = (map f [0 ..] !!)

memoized_fib :: Int -> Integer
memoized_fib = memoizer fib
    where fib 0 = 0
          fib 1 = 1
          fib n = memoized_fib (n-2) + memoized_fib (n-1)

This works, but when I just change the last line to the following code, memoization suddenly does not work as I expected (the program becomes slow again):

          fib n = memoizer fib (n-2) + memoizer fib (n-1)

Where is the crucial difference w.r.t. to memoization?

H.B.
  • 166,899
  • 29
  • 327
  • 400
Bastian
  • 4,638
  • 6
  • 36
  • 55

2 Answers2

6

It is about explicit vs. implicit sharing. When you explicitly name a thing, it naturally can be shared, i.e. exist as separate entity in memory, and reused. (Of course sharing is not part of the language per se, we can only nudge the compiler ever so slightly towards sharing certain things).

But when you write same expression twice or thrice, you rely on compiler to replace the common sub-expressions with one explicitly shared entity. That might or might not happen.

Your first variant is equivalent to

memoized_fib :: Int -> Integer
memoized_fib = (map fib [0 ..] !!)  where
        fib 0 = 0
        fib 1 = 1
        fib n = memoized_fib (n-2) + memoized_fib (n-1)

Here you specifically name an entity, and refer to it by that name. But that is a function. To make the reuse even more certain, we can name the actual list of values that gets shared here, explicitly:

memoized_fib :: Int -> Integer
memoized_fib = (fibs !!)  where
        fibs = map fib [0 ..] 
        fib 0 = 0
        fib 1 = 1
        fib n = memoized_fib (n-2) + memoized_fib (n-1)

The last line can be made yet more visually apparent, with explicit reference to the actual entity which is shared here - the list fibs which we just named in the step above:

        fib n = fibs !! (n-2) + fibs !! (n-1)

Your second variant is equivalent to this:

memoized_fib :: Int -> Integer
memoized_fib = (map fib [0 ..] !!)  where
        fib 0 = 0
        fib 1 = 1
        fib n = (map fib [0 ..] !!) (n-2) + (map fib [0 ..] !!) (n-1)

Here we have three seemingly independent map expressions, which might or might not get shared by a compiler. Compiling it with ghc -O2 seems to reintroduce sharing, and with it the speed.

Will Ness
  • 70,110
  • 9
  • 98
  • 181
  • In the second example, you could just write `fib n = fibs !! (n-2) + fibs !! (n-1)`. I'm not sure if that defeats your point, but I'm also not certain that the version you wrote shares. – Ben Millwood Aug 18 '12 at 22:03
  • From an aesthetic point of view I would stick to Will Ness' version of the second solution because it is more readable. – Bastian Aug 19 '12 at 08:05
  • @BenMillwood yes you could; but the two behaved exactly the same when I checked. I.e., exhibited sharing under -O2 compilation. Because it's the named list `fibs` that gets shared here (which was my point). If the OP's 1st version shared, then my re-write of it too has to share, because it explicitly names the expression that the compiler shared even when unnamed. – Will Ness Aug 19 '12 at 08:54
  • @Bastian I actually prefer the more explicit version with `fibs !! (n-2) + ... `, because there we *see* the explicit reference to the list which is actually the entity which is shared - the list `fibs`. Just tried not to make too long a post. Will edit this in. – Will Ness Aug 19 '12 at 08:57
  • @ Will Ness Please be patient with me for another 2 days - due to an exam on Friday I currently don't have the time to bother with this problem. I will get back to it at the weekend. But thank you for your dedication! :-) – Bastian Aug 22 '12 at 18:34
  • So I just reread your post and am going with it. Is there any Haskell reference to this topic to get more familiar with it (e.g. some book about Haskell)? – Bastian Aug 28 '12 at 08:30
  • @Bastian have no idea. :) really. (Haskell has no documentation, is what I think). You can try some more SO entries linked on the right here. The main thing is - memoization or not, sharing or not, is not part of the Language, at best it is a part of a compiler. Even when we name stuff, compiler might choose to recalculate it instead of reusing it. We have **no** control in the matter. Some say it is *missing* in Haskell (in that respect). – Will Ness Aug 28 '12 at 08:38
3

momoized_fib = ... - that's top-level simple definition. it might be read as a constant lazy value (without any additional arguments required to be bound before expanding it. That's kinda "source" of your memoized values.

When you use (memoizer fib) (n-2) creates new source of values which have no relation with memoized_fib and thus it isn't reused. Actually you move a lot of load on GC here since you produce a lot (map fib [0 ..]) sequences in second variant.

Consider also more simple example:

f = \n -> sq !! n where sq = [x*x | x <- [0 ..]]
g n = sq !! n where sq = [x*x | x <- [0 ..]]

First will generate single f and associated with it sq because there is no n in head of declaration. Second will produce family of lists for each different value of f n and move over it (without bounding down to actual values) to get value.

ony
  • 12,457
  • 1
  • 33
  • 41
  • I think `g n = ...` is a syntactic sugar for `g = \n-> ...`. There shouldn't be any difference. You're thinking about `f = (sq !!) where sq = ...`, but that's not what you wrote. Plus, the definition of `sq` is actually *not dependent* on *n*, so compiler is free to reuse the `sq` list. Unless prevented by polymorphism in the absence of monomorphic type signature of course. – Will Ness Aug 18 '12 at 20:43
  • 3
    @WillNess: the key is whether or not the `where` clause is inside or outside the lambda, which the syntactic sugar *does* change, since `where` only attaches to declarations, not expressions. – Ben Millwood Aug 18 '12 at 22:01
  • @ony turns out I was wrong, sorry, :) `where` goes with the `f`, so you indeed wrote what you thought you wrote, and not what I thought I read. :) :) – Will Ness Aug 19 '12 at 09:37
  • @BenMillwood thanks for the correction! so the difference is between the [pattern binding](http://www.haskell.org/onlinereport/haskell2010/haskellch4.html#x10-860004.4.3.2) for `f` producing `f = let sq=[...] in (\n-> sq !! n)`, and a [function binding](http://www.haskell.org/onlinereport/haskell2010/haskellch4.html#x10-840004.4.3.1) for `g` producing an equivalent of `g = \n-> let sq = [...] in sq !! n` - correct? Still, compiler can float the `let sq =` (right?) out of the lambda in `g`, theoretically, since the definition of `sq` does not refer to `n` anywhere... unless it's polymorphic... – Will Ness Aug 19 '12 at 10:41
  • @WillNess: Yup, that's correct and it could indeed be floated out. It probably isn't a good idea, though - `sq` is very cheap to create, so keeping it in memory is a waste. It's better to be explicit - generally, you do not want to depend on compiler optimizations to improve asymptotic behaviour of your code (as with `fib`). – Vitus Aug 19 '12 at 14:18
  • @Vitus thanks for the confirmation! :) as for `sq`, it was just an example... About explicitness, couldn't agree more, that's exactly the point I tried to make in [my answer here](http://stackoverflow.com/a/12022253/849891). Explication is a valuable design principle in its own right too. – Will Ness Aug 19 '12 at 14:57