
I don't quite understand how `seq` works or how to implement a function using it. As an exercise, I want to implement a function that generates an infinite list starting from a number n.

For example, I'm using the following function:

countup n = n : countup (n+1)

and trying to get [1, 2, 3, 4, ...] (starting from 1, incrementing by 1 each time) as an infinite lazy list. How should I do this?

UPDATED:

I'm trying to make take k (countup 0) use O(1) space. Here, the take function is as follows:

take k (x:xl) = 
    if k==0
    then x
    else take (k-1) xl
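
For reference, the two definitions combined into a runnable sketch (I've renamed take to takeAt to avoid the clash with Prelude.take; note it returns the element at index k, not a prefix):

```haskell
-- Naive version: each element is built as a thunk (previousElement + 1).
countup :: Int -> [Int]
countup n = n : countup (n + 1)

-- The indexing function from above, renamed to avoid Prelude.take.
takeAt :: Int -> [Int] -> Int
takeAt k (x:xl) =
    if k == 0
    then x
    else takeAt (k - 1) xl

main :: IO ()
main = print (takeAt 5 (countup 1))  -- prints 6
```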
Sod
  • You don’t need `seq` to do that. _Especially_ not if you want your list to be lazy! Just do `countup n = [n..]`. – bradrn Dec 21 '20 at 05:17
  • As for learning how `seq` works, here are some useful articles you might want to check out: http://dev.stephendiehl.com/hask/#laziness https://www.fpcomplete.com/haskell/tutorial/all-about-strictness/ https://wiki.haskell.org/Seq https://stackoverflow.com/questions/11046590/the-seq-function-and-strictness – bradrn Dec 21 '20 at 05:19
  • @bradrn I want to minimize the space usage using seq. So I implemented a function that takes a number (an index) and a list (an infinite list) that finds an element of an index of the infinite list. And I heard I can somehow minimize the space complexity to O(1) if I use seq to implement the list? – Sod Dec 21 '20 at 05:49
  • In general, space complexity is very hard to reason about in Haskell. If your function is using more space than expected, `seq` _can_ help reduce it, but only if you use it correctly. What function are you trying to use it with? – bradrn Dec 21 '20 at 06:23
  • @James: lazy functions can often also *reduce* space. Indeed, `[1..n]` takes, if it remains unevaluated, only O(1) space; it is only when you "consume" the list that it expands, and even then not necessarily to *n* elements, so lazy evaluation can often *save* memory as well. – Willem Van Onsem Dec 21 '20 at 12:39
  • @bradrn I updated the code that I want to minimize the space complexity – Sod Dec 21 '20 at 19:05

1 Answer

countup n = n : countup (n+1)

Each element of the list is created as a thunk previousElement + 1. So if you take the 1000000th or whatever element, that will be a very large thunk (...(n + 1)... + 1) containing ~1000000 suspensions. Even though the :-cells can be GC'd as soon as they are made (so traversing the list spine itself takes O(1) space), the elements themselves duplicate the structure of the list and so take k (countup n) still takes O(k) space.

We would like it if evaluating the : cells would also evaluate the elements. We can do that with

countup n = seq n $ n : countup (n + 1)

Now, when evaluating seq n $ n : countup (n + 1), seq will cause both n and n : countup (n + 1) to be evaluated. Evaluating the latter does nothing (it is already evaluated), and evaluating the former performs any thunked addition so that the + 1s never build up. With this definition, take k (countup n) takes O(1) space (or, really O(log(n + k))).
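
As a quick sanity check, here is a sketch combining this definition with the indexing-style take from the question (renamed takeAt to avoid the Prelude clash; the rename is mine):

```haskell
-- Forcing n before the cons cell is returned keeps the
-- pending (+ 1) thunks from accumulating as the list is consumed.
countup :: Int -> [Int]
countup n = seq n $ n : countup (n + 1)

takeAt :: Int -> [Int] -> Int
takeAt k (x:xl)
    | k == 0    = x
    | otherwise = takeAt (k - 1) xl

main :: IO ()
main = print (takeAt 1000000 (countup 0))
-- prints 1000000; compiled with ghc -O this should stay
-- in constant residency (see the RTS -s numbers in the comments)
```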

We can also write the improved function as

countup !n = n : countup (n + 1)
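
The bang pattern requires the `BangPatterns` extension; a minimal sketch:

```haskell
{-# LANGUAGE BangPatterns #-}

-- `!n` forces n to WHNF before the cons cell is returned,
-- equivalent to the seq version above.
countup :: Int -> [Int]
countup !n = n : countup (n + 1)

main :: IO ()
main = print (take 5 (countup 0))  -- prints [0,1,2,3,4]
```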
HTNW
  • I tried profiling both versions of `countup` as an experiment, but I can see no difference between them in either their memory usage or their speed. Are you sure that the implementation with `seq` take less space than the one without? – bradrn Dec 22 '20 at 02:22
  • Yes, quite sure. Compile `{-# LANGUAGE BangPatterns #-} module Main where import Prelude hiding (take); countUp !x = x : countUp (x + 1); take 0 (x : _) = x; take n (_ : xs) = take (n - 1) xs; main = print =<< (flip take (countUp 0) <$> readLn)` with `ghc -O`. Then run it with `./Main +RTS -s`. Give inputs `1000`, `1000000`, `10000000`: the output (for me) shows 2 MB heap usage. Now remove the `!` and recompile. Passing `1000` still uses ~2 MB. Passing `1000000` takes ~67 MB, and `10000000` ~670 MB. Ouch! Don't know what you did, but note "profiling" can give wrong results and is unnecessary. – HTNW Dec 22 '20 at 02:28
  • Yet another way to write it: ``(!:) :: a -> [a] -> [a]; a !: as = a `seq` a : as``, then `countup n = n !: countup (n + 1)`. – dfeuer Dec 22 '20 at 03:09
  • OK @HTNW, I’ll try that and see if it works. (Hmm… maybe our difference in results was because we used different GHC options? I compiled with `-prof` rather than `-O` so I could make a heap profile.) – bradrn Dec 22 '20 at 03:33
  • Thanks! I had gotten as far as the first countup version, ``countup n = n `seq` n : countup (n + 1)``, but I wasn't sure it was right. And thanks for the additional information on how to check heap usage :) lovely. @HTNW – Sod Dec 22 '20 at 03:36
  • @HTNW Nope, can’t replicate your results: I get 0.2 MB for 1000, 80 MB for 1000000, 800 MB for 10000000. (See https://pastebin.com/DBNiQMZK for output.) – bradrn Dec 22 '20 at 03:45
  • @bradrn You're looking at the wrong number. "Total allocations" is not a measure of memory use; it is a measure of *work*. It counts up on each allocated object but doesn't count down on deallocation. The *maximum actual memory use at a given time* is "total memory in use", which you see doesn't vary (shows 0, hah!). I.e. yes, the improved `countup` builds and burns through lots and lots of objects, but those objects are separated in time; they don't take up 800MB of actual memory (which is kind of the point: the `O(1)` behavior is because "skipped" cells are GC'd right after creation). – HTNW Dec 22 '20 at 04:16
  • @bradrn E.g. if I pass 1,000,000,000, I get ~80GB of "Total allocations"... the potato I'm running it on has 4GB of RAM. If you try the non-strict one, it actually *does* try to use up all those gobs and gobs of memory... Funnily, it seems like the total allocations is about twice the max memory usage in that case, which I think might correspond to there being one retained object (the `n + 1` thunk) per two objects total (the `:` cell and the thunk). – HTNW Dec 22 '20 at 04:39
  • @HTNW Thanks for the correction! Obviously I had no idea how to read that output… that’ll really help next time I need to figure out the heap usage of a program. – bradrn Dec 22 '20 at 04:52