112

I can't figure out why m1 is apparently memoized while m2 is not in the following:

m1      = ((filter odd [1..]) !!)

m2 n    = ((filter odd [1..]) !! n)

m1 10000000 takes about 1.5 seconds on the first call, and a fraction of that on subsequent calls (presumably it caches the list), whereas m2 10000000 always takes the same amount of time (rebuilding the list with each call). Any idea what's going on? Are there any rules of thumb as to if and when GHC will memoize a function? Thanks.

Pi Delport
  • 10,356
  • 3
  • 36
  • 50
Jordan
  • 1,121
  • 2
  • 8
  • 3

4 Answers4

121

GHC does not memoize functions.

It does, however, compute any given expression in the code at most once per time that its surrounding lambda-expression is entered, or at most once ever if it is at top level. Determining where the lambda-expressions are can be a little tricky when you use syntactic sugar like in your example, so let's convert these to equivalent desugared syntax:

m1' = (!!) (filter odd [1..])              -- NB: See below!
m2' = \n -> (!!) (filter odd [1..]) n

(Note: The Haskell 98 report actually describes a left operator section like (a %) as equivalent to \b -> (%) a b, but GHC desugars it to (%) a. These are technically different because they can be distinguished by seq. I think I might have submitted a GHC Trac ticket about this.)

Given this, you can see that in m1', the expression filter odd [1..] is not contained in any lambda-expression, so it will only be computed once per run of your program, while in m2', filter odd [1..] will be computed each time the lambda-expression is entered, i.e., on each call of m2'. That explains the difference in timing you are seeing.


Actually, some versions of GHC, with certain optimization options, will share more values than the above description indicates. This can be problematic in some situations. For example, consider the function

f = \x -> let y = [1..30000000] in foldl' (+) 0 (y ++ [x])

GHC might notice that y does not depend on x and rewrite the function to

f = let y = [1..30000000] in \x -> foldl' (+) 0 (y ++ [x])

In this case, the new version is much less efficient because it will have to read about 1 GB from memory where y is stored, while the original version would run in constant space and fit in the processor's cache. In fact, under GHC 6.12.1, the function f is almost twice as fast when compiled without optimizations than it is compiled with -O2.

Reid Barton
  • 14,951
  • 3
  • 39
  • 49
  • 1
    The cost to evaluate (filter odd [1..]) expression is close to zero anyway - it is lazy list after all, so the real cost is in (x !! 10000000) application when the list is actually evaluated. Besides, both m1 and m2 seems to be evaluated only once with -O2 and -O1 (on my ghc 6.12.3) at least within the following test: (test = m1 10000000 `seq` m1 10000000). There is a difference though when no optimization flag is specified. And both variants of your "f" have maximum residency of 5356 bytes regardless of optimization, by the way (with less total allocation when -O2 is used). – Ed'ka Oct 17 '10 at 10:41
  • 1
    @Ed'ka: Try this test program, with the above definition of `f`: `main = interact $ unlines . (show . map f . read) . lines`; compile with or without `-O2`; then `echo 1 | ./main`. If you write a test like `main = print (f 5)`, then `y` can be garbage collected as it is used and there is no difference between the two `f`s. – Reid Barton Oct 17 '10 at 14:44
  • er, that should be `map (show . f . read)`, of course. And now that I've downloaded GHC 6.12.3, I see the same results as in GHC 6.12.1. And yes, you are right about the original `m1` and `m2`: versions of GHC that perform this kind of lifting with optimizations enabled will transform `m2` into `m1`. – Reid Barton Oct 17 '10 at 15:36
  • Yes, now I see the difference (-O2 is definitely slower). Thank you for this example! – Ed'ka Oct 17 '10 at 17:42
31

m1 is computed only once because it is a Constant Applicative Form, while m2 is not a CAF, and so is computed for each evaluation.

See the GHC wiki on CAFs: http://www.haskell.org/haskellwiki/Constant_applicative_form

sclv
  • 38,665
  • 7
  • 99
  • 204
  • 1
    The explanation “m1 is computed only once because it is a Constant Applicative Form” does not make sense to me. Because presumably both m1 and m2 are top-level variables, I think that these _functions_ are computed only once, no matter they are CAFs or not. The difference is whether the list `[1 ..]` is computed only once during the execution of a program or it is computed once per application of the function, but is it related to CAF? – Tsuyoshi Ito Oct 17 '10 at 02:56
  • 1
    From the linked page: "A CAF ... can either be compiled to a piece of graph which will be shared by all uses or to some shared code which will overwrite itself with some graph the first time it is evaluated". Since `m1` is a CAF, the second applies and `filter odd [1..]` (not just `[1..]`!) is computed only once. GHC could also note that `m2` refers to `filter odd [1..]`, and place a link to the same thunk used in `m1`, but that would be a bad idea: it could lead to large memory leaks in some situations. – Alexey Romanov Oct 17 '10 at 03:33
  • @Alexey: Thank you for the correction about `[1..]` and `filter odd [1..]`. For the rest, I am still unconvinced. If I am not mistaken, CAF is only relevant when we want to argue that a compiler _could_ replace the `filter odd [1..]` in `m2` by a global thunk (which may be even the same thunk as the one used in `m1`). But in the asker’s situation, the compiler did _not_ do that “optimization,” and I cannot see its relevance to the question. – Tsuyoshi Ito Oct 17 '10 at 11:36
  • 2
    It is relevant that it can replace it _in_ `m1`, and it does. – Alexey Romanov Oct 17 '10 at 17:34
15

There is a crucial difference between the two forms: the monomorphism restriction applies to m1 but not m2, because m2 has explicitly given arguments. So m2's type is general but m1's is specific. The types they are assigned are:

m1 :: Int -> Integer
m2 :: (Integral a) => Int -> a

Most Haskell compilers and interpreters (all of them that I know of actually) do not memoize polymorphic structures, so m2's internal list is recreated every time it's called, where m1's is not.

mokus
  • 3,490
  • 16
  • 22
  • 1
    Playing with these in GHCi it seems it is also depending on the let-floating transformation (one of GHC's optimization passes that is not used in GHCi). And of course when compiling these simple functions, the optimizer is able to make them behave identically anyway (according to some criterion tests I ran anyway, with the functions in a separate module and marked with NOINLINE pragmas). Presumably that's because the list generation and indexing gets fused into a super tight loop anyway. – mokus Oct 18 '10 at 14:13
1

I'm not sure, because I'm quite new to Haskell myself, but it appears that it's beacuse the second function is parametrized and the first one is not. The nature of the function is that, it's result depends on input value and in functional paradigm especailly it depends ONLY on the input. Obvious implication is that a function with no parameters returns always the same value over and over, no matter what.

Aparently there's an optimizing mechanizm in GHC compiler that exploits this fact to compute the value of such a function only once for whole program runtime. It does it lazily, to be sure, but does it nonetheless. I noticed it myself, when I wrote the following function:

primes = filter isPrime [2..]
    where isPrime n = null [factor | factor <- [2..n-1], factor `divides` n]
        where f `divides` n = (n `mod` f) == 0

Then to test it, I entered GHCI and wrote: primes !! 1000. It took a few seconds, but finally I got the answer: 7927. Then I called primes !! 1001 and got the answer instantly. Similarly in an instant I got the result for take 1000 primes, because Haskell had to compute the whole thousand-element list to return 1001st element before.

Thus if you can write your function such that it takes no parameters, you probably want it. ;)

Sventimir
  • 1,996
  • 3
  • 14
  • 25