18

In a pure functional language with lazy semantics (such as Haskell), results of computations are memoized so that further evaluations of a function with the same inputs do not recompute the value but get it directly from the cache of memoized values.

I am wondering if these memoized values get recycled at some point in time?

  1. If so, it means that the memoized values must be recomputed at a later time, and the benefits of memoization are not so exciting IMHO.
  2. If not, then OK, it is clever to cache everything... but does it mean that a program, if run for a sufficiently long period of time, will always consume more and more memory?

Imagine a program performing intensive numerical analysis: for example, finding the roots of hundreds of thousands of mathematical functions using a bisection (dichotomy) algorithm.

Every time the program evaluates a mathematical function at a specific real number, the result would be memoized. But there is only a very small probability that exactly the same real number will appear again during the algorithm, leading to a memory leak (or at least, really bad memory usage).
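To make the scenario concrete, here is a minimal bisection sketch (hypothetical code, just to fix ideas): almost every midpoint it evaluates the function at is unique, so a cache keyed on the input would essentially never be hit twice.

```haskell
-- Hypothetical bisection root-finder: narrows [lo, hi] until it is
-- shorter than eps, assuming f changes sign on the initial interval.
bisect :: (Double -> Double) -> Double -> Double -> Double -> Double
bisect f lo hi eps
  | hi - lo < eps       = mid
  | f mid * f lo <= 0   = bisect f lo mid eps  -- sign change in [lo, mid]
  | otherwise           = bisect f mid hi eps  -- sign change in [mid, hi]
  where mid = (lo + hi) / 2

main :: IO ()
main = print (bisect (\x -> x * x - 2) 0 2 1e-9)  -- converges to sqrt 2
```

Every recursive call produces a fresh midpoint, so memoizing `f` on its `Double` argument would store thousands of entries and almost never reuse one.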

My idea is that maybe memoized values are simply "scoped" to something in the program (for example to the current continuation, call stack, etc.), but I was unable to find anything practical on the subject.

I admit I haven't looked deeply at the Haskell compiler implementation, but could someone please explain to me how it works in practice?


EDIT: OK, I understand my mistake from the first few answers: pure semantics implies referential transparency, which in turn does not imply automatic memoization; it just guarantees that memoization would cause no problems.

I think a few articles on the web are misleading about this, because from a beginner's point of view, it seems that the referential transparency property is so cool precisely because it would allow implicit memoization.

sclv
  • 38,665
  • 7
  • 99
  • 204
Gabriel Cuvillier
  • 3,617
  • 1
  • 28
  • 35

2 Answers

20

Haskell does not automatically memoize function calls, precisely because this would quickly consume tons of memory. If you do memoization yourself, you get to choose at what scope the function is memoized. For example, let's say you have the Fibonacci function defined like this:

fib n = fibs !! n
    where fibs = 0 : 1 : zipWith (+) fibs (tail fibs)

Here the memoization is only done within one call to fib, whereas if you instead define fibs at the top level

fib n = fibs !! n

fibs = 0 : 1 : zipWith (+) fibs (tail fibs)

then the memoized list is kept until the garbage collector can determine that there are no more ways to reach fibs from any part of your program.
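To see the difference at runtime, one can instrument the top-level version with Debug.Trace (a sketch; the trace message is my own addition). As the comments below the answer note, a top-level value like fibs is a constant applicative form (CAF), which GHC evaluates at most once:

```haskell
import Debug.Trace (trace)

-- Top-level fibs is a CAF: the trace fires the first time the list is
-- forced, and the evaluated prefix is shared by all later calls to fib.
fibs :: [Integer]
fibs = trace "evaluating fibs" (0 : 1 : zipWith (+) fibs (tail fibs))

fib :: Int -> Integer
fib n = fibs !! n

main :: IO ()
main = do
  print (fib 10)  -- "evaluating fibs" is printed here, once; prints 55
  print (fib 20)  -- reuses the already-evaluated prefix; prints 6765
```

With the `where`-bound version from the answer, the trace would fire on every call to fib instead, since each call builds its own fibs.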

hammar
  • 138,522
  • 17
  • 304
  • 385
  • +1 For the suggestive example - though I bet that in this case *actually* the compiler will see that fibs does not need to be a local value and lift it silently to the top. – Ingo May 05 '11 at 13:45
  • 2
    @Ingo: I was under the impression that the compiler _does not_ float a let outside a lambda, precisely because this can have major consequences for memory usage, but I might be wrong. – hammar May 05 '11 at 13:57
  • @Ingo: Seems there is an optimization called _full laziness_ which does this, but I'm not sure if it applies in this case. – hammar May 05 '11 at 14:19
  • 2
    Note that even in the case of the memoized fib above the fibs value may get garbage collected when the function fib is no longer reachable (ghc does this). – augustss May 05 '11 at 15:47
  • 1
    The word you guys are looking for is [CAF](http://www.haskell.org/haskellwiki/Constant_applicative_form), see also [this question](http://stackoverflow.com/questions/5032750/how-to-tell-whether-haskell-will-cache-a-result-or-recompute-it). – barsoap May 06 '11 at 19:23
  • 1
    Is this behavior mandated by the Haskell standard? Or, does it just so happen that GHC does things this way? – Paul Merrill Jun 02 '12 at 04:29
5

My idea is that maybe memoized values are simply "scoped" to something in the program (for example to the current continuation, call stack, etc.), but I was unable to find something practical on the subject.

This is correct. Specifically, when you see something like:

fun x y = let
    a = .....
    b = ....
    c = ....
in ....

or an equivalent where clause, the values a, b and c may not be computed until actually used (or they may be computed right away, because the strictness analyser can prove that the values would be evaluated later anyway). But when those values depend on the current function parameters (here x and y), the runtime will most likely not remember every combination of x and y and the resulting a, b and c.
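A small experiment (hypothetical names, instrumented with Debug.Trace) makes this visible: a where-bound value is evaluated at most once within a single call, but nothing is remembered between calls:

```haskell
import Debug.Trace (trace)

fun :: Int -> Int
fun x = a + a  -- 'a' is used twice but evaluated at most once per call
  where a = trace ("computing a for x = " ++ show x) (x * x)

main :: IO ()
main = do
  print (fun 3)  -- the trace fires once, then prints 18
  print (fun 3)  -- the trace fires again: nothing persists across calls
```

The thunk for a is shared within one activation of fun, which is exactly the "scoping" the question guessed at; there is no table mapping x to a kept anywhere.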

Note also that, in a pure language, it is also okay not to remember the values at all; caching is only worthwhile insofar as memory is cheaper than CPU time.

So the answer to your question is: there is no such thing as a lifetime of intermediate results in Haskell. All one can say is that the evaluated value will be there when needed.

Ingo
  • 36,037
  • 5
  • 53
  • 100