3

I'm having difficulties in understanding Haskell lazy evaluation.

I wrote simple test program. It reads 4 lines of data and second and fourth input line has lots of numbers.

consumeList :: [Int] -> [Int] -> [Int]
consumeList [] _ = error "hi" -- to generate heap debug
consumeList (x:xs) y = consumeList xs y   
main = do
    inputdata <- getContents
    let (x:y:z:k:xs) = lines inputdata
        s = map (read ::String->Int) $ words $ k
        t = []
    print $ consumeList s t

words and map is performed on streams of characters lazily, this program use constant memory. constant memory usage

But As I add arguments t, situation is changed. My expectation is since t is map and words on lazy stream, and t is not used in consumeList, this change should not alter memory usage. But no.

consumeList :: [Int] -> [Int] -> [Int]
consumeList [] _ = error "hi" -- to generate heap debug
consumeList (x:xs) y = consumeList xs y
main = do
    inputdata <- getContents
    let (x:y:z:k:xs) = lines inputdata
        s = map (read ::String->Int) $ words $ k
        t = map (read ::String->Int) $ words $ y
    print $ consumeList s t    -- <-- t is not used

memory is increasing

Q1) Why does this program keeps allocating memory when t is not used at all?

I have another question. When I pattern match lazy stream with [,], not with (:) memory allocation behaviour is changed.

consumeList :: [Int] -> [Int] -> [Int]
consumeList [] _ = error "hi" -- to generate heap debug
consumeList (x:xs) y = consumeList xs y   
main = do
    inputdata <- getContents
    let [x,y,z,k] = lines inputdata    -- <---- changed from (x:y:..)
        s = map (read ::String->Int) $ words $ k
        t = []
    print $ consumeList s t

memory keeps increasing

Q2) are (:) and [,] different in terms of lazy evalutation?

Any comments are welcomed. Thanks

[EDIT]

Q3) Then, is it possible to process 4th line first and the process 2nd line, without increasing memory consumption?

The experiment guided by Derek is as follows. With switching y and k from second example, I got same result:

consumeList :: [Int] -> [Int] -> [Int]
consumeList [] _ = error "hi"
consumeList (x:xs) y = consumeList xs y
main = do
    inputdata <- getContents
    let (x:y:z:k:xs) = lines inputdata
        s = map (read ::String->Int) $ words $ y  -- <- swap with k
        t = map (read ::String->Int) $ words $ k  -- <- swap with y
    print $ consumeList s t

enter image description here

Chul-Woong Yang
  • 1,223
  • 10
  • 17

1 Answers1

5

To answer your first question, t is live as far as the garbage collector is concerned until you reach the end of the consumeList. That wouldn't be such a big deal since t would be a thunk pointing at work to do, but the problem here is that thunk is now keeping y alive and getContents has to actually read in y to get to k. In your first example, y could be garbage collected as it was being read in. (As an experiment, if you switched y and k in this example, I suspect you'd see behavior very much like your first example.)

For your second question, let [x,y,z,k] = ... means "(irrefutably) match a list of exactly four elements". That means when you force k it needs to (at that point) check that there are no further elements, which means it needs to read in all of the input corresponding to k before it can start processing it. In the earlier case, let (x:y:z:k:xs) = ... it can start processing k immediately because it doesn't have to first check whether xs is [] or (_:_).

Derek Elkins left SE
  • 2,079
  • 12
  • 14
  • I've get a bit of understanding. Thank you. Would you answer me some more questsions? (1) Then is it possible to process 4th line and then process 2nd line without increasing memory consumption? If mere holding a pointer of `y` from `(x:y:z:k:xs)` forces readiing and processing a stream, is it impossible? (2) Does graphs' `ghc-prim.GHC.Types:` indicates raw string read by `getContents` or list processed by `map . words`? I see the amount of memory is ~100MB, but actual input data is 5MB. so, I suspect the allocated memory might be `map . words` processed one. – Chul-Woong Yang Jan 22 '16 at 06:31
  • maybe you should start new questions - but usually it's a good idea to move away from lazy-IO in those cases (as you've seen it's a bit of a pain sometimes) – Random Dev Jan 22 '16 at 06:34
  • I see. I'll start a new question for the first question of above comment. – Chul-Woong Yang Jan 22 '16 at 06:37