1

I have been working on a problem that benefits a lot from catching results of my functions and in my research I came across this article. I am stunned at how simple is the core in "Memoization with recursion" section namely:

memoized_fib :: Int -> Integer
memoized_fib = (map fib [0 ..] !!)
   where fib 0 = 0
         fib 1 = 1
         fib n = memoized_fib (n-2) + memoized_fib (n-1)

I feel like I understand how it work but do correct me if I'm wrong - this function saves a list which is populated using same function.

What bothers me is that I don't understand why this works, originally I was under impression that once haskell evaluates a function it releases memory that was used to store variables inside this function, but here it seems that if part of the list was evaluated by one call of this function those values are still available to another call of the same function.

Just typing this up makes my head hurt, because I don't understand why value used in calculation of fib 2 should be available in calculation of fib 3 or better yest fib 100?

My gut feeling tells me that this behavior has two problems(I'm probably wrong but again not sure why):

  • purity of this function we are evaluating one call using variable that did not arrive from parameters of this function
  • memory leaks no longer sure when will haskell release memory from this list
Will Ness
  • 70,110
  • 9
  • 98
  • 181
sgp667
  • 1,797
  • 2
  • 20
  • 38

2 Answers2

2

I think it's easier to understand if you compare your definition to this:

not_memoized_fib :: Int -> Integer
not_memoized_fib m = map fib [0 ..] !! m
   where fib 0 = 0
         fib 1 = 1
         fib n = not_memoized_fib (n-2) + not_memoized_fib (n-1)

The definition above is essentially the same as yours, except that it takes an explicit argument m. It is a so-called eta-expansion of the previous function, and is semantically equivalent to it. Yet, operationally, this has drastically worse performance, since memoization here does not take place.

Why? Well, your function defines the list map fib [0..] before taking the (implicit) input parameter m, so there is only one list around, for all m we may pass later on as arguments. Instead, in not_memoized_fib we first take m as input, and then define the list, making the function create a list for every call to not_memoized_fib, destroying performance.

It is even easier to see if we use let and lambdas instead of where. Compare

memoized :: Int -> Integer
memoized = let
   list = map fib [0..]
   fib 0 = 0
   fib 1 = 1
   fib n = memoized (n-1) + memoized (n-2)
   in \m -> list !! m
   -- ^^ here we take m, after everything is defined,

with its let over lambda (*) code structure, to

not_memoized :: Int -> Integer
not_memoized = \m -> let
            -- ^^ here we take m, before everything is defined, so
            -- we define local bindings {list,fib} at every call
   list = map fib [0..]
   fib 0 = 0
   fib 1 = 1
   fib n = not_memoized (n-1) + not_memoized (n-2)
   in list !! m

with the let inside the lambda.

In the former case, there is only one list around, while in the latter there is one list for each call.


(*) a searchable term.

Will Ness
  • 70,110
  • 9
  • 98
  • 181
chi
  • 111,837
  • 3
  • 133
  • 218
  • BTW, [apparently](https://stackoverflow.com/a/11466959/849891), older GHC with -O2 would compile even the `not_memoized_fib` as a memoizing one. do we know for sure what the more current versions do? perhaps this is related to the CSE changes... – Will Ness Oct 10 '19 at 12:36
  • With GHC 8.6.4 and `-O2`, both `memoized_fib` and `not_memoized_fib` are equally fast. Switching to `last $ map fib [0..m]` makes it (extremely) slow. – K. A. Buhr Oct 14 '19 at 19:23
1

The list defined by map fib [0..] is defined as part of the definition of the function, rather than being created each time the function is called. Due to laziness, though, the list is only "realized" as necessary for any given call.

Say your first call is memoized_fib 10. This will cause the first 10 Fibonacci numbers to actually be computed and stored in memory, and they will stay in memory for the duration of the program. Subsequent calls with a smaller argument don't need to compute anything; subsequent calls with larger arguments need only compute those elements that occur later in the list than the largest existing element.

chepner
  • 497,756
  • 71
  • 530
  • 681