What is the magic behind this lazy evaluation example?

Question

I want to challenge GHC compiler, so I wrote this code (the detail of the code is actually unimportant, only to show that some hard work has to be done to get each element of this infinite list):

hardwork :: [Int]
hardwork = step1 where
    -- start with value 1
    step1 = step2 1 
    -- force the hard work being done
    step2 i = step3 i $! step4 i
    -- construct the current result, start next 
    step3 i result = result : step2 (i+1)
    -- start the work with n=0 and j=1
    step4 i = step5 i 0 1
    -- if n=1048576, finish work and return j
    step5 _ 1048576 j = j
    -- otherwise increase n, and modify j
    step5 i n j = step5 i (n+1) ((j * i) `mod` 1048575)

Now I use the cleave function described in the Haskellwiki

cleave :: [a] -> ([a],[a])
cleave = step1 where
    step1 xs = (odds xs, evens xs)
    odds [] = []
    odds [x] = [x]
    odds (x:_:xs) = x : odds xs
    evens [] = []
    evens [x] = []
    evens (_:x:xs) = x : evens xs

and the main function

main :: IO ()
main = do
    print (take 5 (fst $ cleave hardwork), take 4 (snd $ cleave hardwork))

Like expected, it prints out values slowly, since it has to do very hard work to get the results. However what surprising me is, once the fist list being printed out, the second list was calculated immediately.

This is a surprise because, since the two occurrence of cleave hardwork seem to be unrelated in the code, and we are accessing the different part of them, it looks like a naive implementation will do the hard work again to get the second list. However GHC seems to be cleverer than I think.

My question is: how did they manage to do so? What is the magic behind this? More precisely, how the runtime figure out some requested value has already been evaluated (even if they were never being accessed)? Are there any cost for this kind of bookkeeping?

BTW, to ensure I am doing the right thing in the right way, I used a un-sugared, step by step style to define hardwork. There are of cause other ways to implement it but if it uses any sugars the behaviour may depend on the detail how the code being de-sugared by the compiler. Also, this step-by-step style makes paper evaluation by manually substitute expressions easier.

EDIT

So according to the answers, I rewrote hardwork to make it not a CAF (this is a more generic way to do so than the answer suggested anyway):

hardwork :: a -> [Int]
hardwork = step1 where
    step1 _ = step2 1
    ...

Now it results in the main running slow on both part of the result. But if I replace main with

print (take 5 $ fst value, take 6 $ snd value) where value = cleave hardwork()

It works the same way the first version do. So it looks like a prove of what the accepted answer said.

http://stackoverflow.com/questions/6090932/how-to-make-a-caf-not-a-caf-in-haskell — misterbee, Jan 13 '14 at 03:11

score 10 · Accepted Answer · answered Jan 13 '14 at 03:01

10

hardwork is a constant, defined at the top level of your program, so once it is computed once, its results are saved (just as if you had started main with let hardwork = ... in ...). If you wanted to compute it twice, you could define it as a function, and either ignore the first argument or use it as the seed, for example by changing the first few lines of hardwork to

hardwork :: Int -> [Int]
hardwork seed = step1 where
    step1 = step2 seed

Then if you call hardwork 1 twice, the same list will be recomputed each time.

answered Jan 13 '14 at 03:01

amalloy

89,153
8
140
205

So, what you mean here is, once `hardwork` being calculated up to nth element, all evaluated values will be stored within its thunk. Am I right? – Earth Engine Jan 13 '14 at 03:08
1

Sure, that's how evaluation works for everything in haskell. Once calculated, all values are saved indefinitely, until the garbage collector proves you can never access them again. – amalloy Jan 13 '14 at 03:12

score 5 · Answer 2 · edited May 23 '17 at 11:49

5

hardwork is a CAF, not a function, so its computation result doesn't get GC'd between accesses.

How to make a CAF not a CAF in Haskell?

edited May 23 '17 at 11:49

Community

1
1

answered Jan 13 '14 at 03:02

misterbee

5,142
3
25
34

What is the magic behind this lazy evaluation example?

2 Answers2