While thinking about this other question, I realized that the following functions smoothSeq
and smoothSeq'
smoothSeq :: (Integer, Integer, Integer) -> [Integer]
smoothSeq (a, b, c) = result
where
result = 1 : union timesA (union timesB timesC)
timesA = map (* a) $ result
timesB = map (* b) $ result
timesC = map (* c) $ result
smoothSeq' :: (Integer, Integer, Integer) -> [Integer]
smoothSeq' (a, b, c) = 1 : union timesA (union timesB timesC)
where
timesA = map (* a) $ smoothSeq' (a, b, c)
timesB = map (* b) $ smoothSeq' (a, b, c)
timesC = map (* c) $ smoothSeq' (a, b, c)
-- | Merge two sorted sequences discarding duplicates.
union :: (Ord a) => [a] -> [a] -> [a]
union [] ys = ys
union xs [] = xs
union (x : xs) (y : ys)
| x < y = x : union xs (y : ys)
| x > y = y : union (x : xs) ys
| otherwise = x : union xs ys
have drastically different performance characteristics:
ghci> smoothSeq (2,3,5) !! 500
944784
(0.01 secs, 311,048 bytes)
ghci> smoothSeq' (2,3,5) !! 500
944784
(11.53 secs, 3,745,885,224 bytes)
My impression is that smoothSeq
is linear in memory and time (as was regularSeq
) because result
is shared in the recursive definition, whereas smoothSeq'
is super-linear because the recursive function definition spawns a tree of computations that independently recompute multiple times the previous terms of the sequence (there is no sharing/memoization of the previous terms; similar to naive Fibonacci).
While looking around for a detailed explanation, I encountered these examples (and others)
fix f = x where x = f x
fix' f = f (fix f)
cycle xs = res where res = xs ++ res
cycle' xs = xs ++ cycle' xs
where again the non-primed version (without '
suffix) is apparently more efficient because it reuses the previous computation.
From what I can see, what differentiates the two versions is whether the recursion involves a function or data (more precisely, a function binding or a pattern binding). Is that enough to explain the difference in behavior? What is the principle behind, that dictates whether something is memoized or not? I couldn't find a definite and comprehensive answer in the Haskell 2010 Language Report, or elsewhere.
Edit: here is another simple example that I could think of:
arithSeq start step = result
where
result = start : map (+ step) result
arithSeq' start step = start : map (+ step) (arithSeq' start step)
ghci> arithSeq 10 100 !! 10000
1000010
(0.01 secs, 1,443,520 bytes)
ghci> arithSeq' 10 100 !! 10000
1000010
(1.30 secs, 5,900,741,048 bytes)
The naive recursive definition arithSeq'
is way worse than arithSeq
, where the recursion "happens on data".