Why isn't this recursive function being optimized? (Haskell)

Question

I wrote my own 'sum' function in Haskell:

mySum [a] = a
mySum (a:as) = a + mySum as

And tested it with

main = putStrLn . show $ mySum [1 .. 400000000]

Only to receive a stack overflow error.

Using the Prelude's sum in the same manner:

main = putStrLn . show $ sum [1 .. 400000000]

I get no stack overflow.

It could be the huge list I'm evaluating, especially if the list passed to my function is being evaluated strictly, though my only reason for not suspecting this is that using the Prelude's sum with the same list I get no error.

This question is a duplicate of http://stackoverflow.com/questions/13042353/does-haskell-have-tail-recursive-optimization, so look there for the answer as well. — Rein Henrichs, May 27 '14 at 17:06

Don Stewart · Answer 1 · 2014-05-28T10:52:39.440

10

Your function fails with a stack overflow because it isn't tail recursive. One stack frame is consumed on each call holding onto the partial sum of 'a' at each step.

The Prelude's sum is implemented with a tail call:

sum     l       = sum' l 0
   where
    sum' []     a = a
    sum' (x:xs) a = sum' xs (a+x)

{-# SPECIALISE sum     :: [Int] -> Int #-}
{-# SPECIALISE sum     :: [Integer] -> Integer #-}
{-# INLINABLE sum #-}

which doesn't consume stack space.

Note that the specialize and inline pragmas are necessary to expose the strictness information that makes the 'a' accumulator safe to use without thunks accumulating. A more modern version would be:

    sum' []     !a = a
    sum' (x:xs) !a = sum' xs (a+x)

to make the strictness assumptions explicit. This is almost equivalent to the foldl' version wrt. strictness.

edited May 28 '14 at 10:52

answered May 27 '14 at 15:37

Don Stewart

137,316
36
365
468

1

Do you know why sum is implemented like that? To be more specific I would have thought `foldl' (+) 0` was the natural implementation. It also seem to work better in ghci where the actual sum overflows for me but this one doesn't? – monocell May 27 '14 at 16:03
5

There's no reason not to use `fold' (+)` these days, now that we have good inlining and specialization (and fusion for left folds). It wasn't always the case. – Don Stewart May 27 '14 at 16:07
http://hackage.haskell.org/package/base-4.7.0.0/docs/src/Data-List.html#sum The full definition includes a foldl – user3680400 May 27 '14 at 16:09
Note: `#ifdef USE_REPORT_PRELUDE` -- Haskell 98 didn't have `foldl'` – Don Stewart May 27 '14 at 16:12
1

You will absolutely get a stack overflow error using this `sum` unless you make `a` strict in `sum'`. – luqui May 27 '14 at 18:37
1

The library implementation certainly relies on strictness analysis and specialisation. It's better written with a bang on the accumulator. – Don Stewart May 27 '14 at 21:03

score 4 · Accepted Answer · answered May 27 '14 at 15:38

4

You're probably hoping that the compiler will perform tail call optimization on your method. Unfortunately, this definition of mySum is not tail-call optimizable. What is required for it to be is for the last function being called to be the recursive call, so in this case you'd want mySum to be the last function called. However, the last function being called in your definition is (+), not mySum. You could instead write it as @DonStewart has suggested, who managed to type out that solution before I was able to.

answered May 27 '14 at 15:38

bheklilr

53,530
6
107
163

1

it is TR modulo `(+)` though. Friedman and Wise described this transformation for the more involved [`(:)` case](http://en.wikipedia.org/wiki/Tail_call#Tail_recursion_modulo_cons) in their [TR19](http://www.cs.indiana.edu/cgi-bin/techreports/TRNNN.cgi?trnum=TR19) back in 1974, where they mention also "cumulative and associative constructors such as integer multiplication ... [as in] factorial" (`+` obviously also applies). – Will Ness May 28 '14 at 09:21

score 3 · Answer 3 · edited May 23 '17 at 10:32

Edit: I just realized that the question is a duplicate and I have basically reinvented its answer from Does Haskell have tail-recursive optimization?

GHC evaluates expressions using what is known as lazy evaluation. The most relevant feature of lazy evaluation for this discussion is what is known as "left-most, outermost evaluation" or normal-order evaluation. To see normal-order evaluation in action, let's follow the evaluation of two implementations of sum, a foldr implementation and a foldl implementation:

foldr (+) 0 (1:2:3:[])
1 + foldr (+) 0 (2:3:[])
1 + (2 + foldr (+) 0 (3:[]))
1 + (2 + (3 + foldr (+) 0 [])))
1 + (2 + (3 + 0))
1 + (2 + 3)
1 + 5
6

Note that because the recursive call to foldr is not left-most, outermost, lazy evaluation cannot reduce it. However, because (+) is strict in its second argument, the right hand side will be evaluated, leaving behind a chain of additions. Because the call to (+) is left-most, outermost, this implementation is similar to your implementation of sum.

One often hears that foldl is more efficient because of tail recursion, but is it so?

foldl (+) 0 (1:2:3:[])
foldl (+) (0+1) (2:3:[])
foldl (+) ((0+1)+2) (3:[])
foldl (+) (((0+1)+2)+3) []
((0+1)+2)+3
(1+2)+3
3+3
6

Note a few differences. First, the recursive call to foldl is left-most outermost and, because it is reduced by normal-order evaluation, does not take up any extra space on the stack. However, the calls to (+) are not reduced and do take up space on the stack. This should be enough to convince you that "tail recursion" is not sufficient to prevent space leaks in GHC.

So, we can use a tail position call to prevent the build-up of thunks representing calls to foldl (or sum' in Don's version), but how can we prevent the build-up of thunks for (+)? We can use strictness annotations, or let foldl' add them for us:

foldl' (+) 0 (1:2:3:[])
foldl' (+) 1 (2:3:[])
foldl' (+) 3 (3:[])
foldl' (+) 6 []
6

Note that this takes constant stack space and constant heap space.

In conclusion, if your recursive calls are left-most, outermost (which corresponds to tail position) they can be reduced by lazy evaluation. This is necessary but not sufficient in preventing the evaluation of your recursive function from using O(n) stack and heap space. foldl and foldr style recursion by themselves both take O(n) stack and heap space. foldl-style recursion with strictness annotations on accumulating parameters is necessary to get evaluation to operate in constant space.

there is no "build-up of thunks" in your `foldr` example: each new thunk created for the next tail is immediately consumed, forced by a strict `+` constructor. What that is, is the winding up of the stack. In `foldl` example there is indeed the build-up of (nested) thunks - the `{0+1}` referred to by the `{_+2}`, and that one by the next, `{_+3}`. Then when `[]` is reached, the outermost thunk is forced, which causes creation of the stack of summations `{_+3} = {_+2} + 3 = ({0+1} + 2) + 3 = (1 + 2) + 3 = ...`. -- and of course, `foldr` with non-strict combining function is O(1), in itself. :) — Will Ness, May 28 '14 at 10:00

Why isn't this recursive function being optimized? (Haskell)

3 Answers3