Why is the tail call optimization not used in this Haskell program?

Question

The following program blows the stack:

__find_first_occurrence :: (Eq b) => b -> [b] -> Int -> Int
__find_first_occurrence e [] i = -1
__find_first_occurrence e (x:xs) i
    | e == x = i
    | otherwise = __find_first_occurrence e xs (i + 1)

find_first_occurrence :: (Eq a) => a -> [a] -> Int
find_first_occurrence elem list =   
    __find_first_occurrence elem list 0

main = do
    let n = 1000000
    let idx = find_first_occurrence n [1..n]
    putStrLn (show idx)

Fails with

Stack space overflow: current size 8388608 bytes. Use `+RTS -Ksize -RTS' to increase it.

However, as far as I understand it, the possible recursive call to __find_first_occurrence is the last thing evaluated by __find_first_occurrence, hence tail call optimization should be possible to do.

TCO isn't all that useful in Haskell. Laziness usually takes care of that so you rarely need it. In this case, I think the problem is huge unevaulated thunk on `i`. Try to force `i` before doing recursive call. — Vitus, Mar 14 '12 at 17:08
I was wondering about the definition of the first pattern match in __find_first_occurrence, namely `__find_first_occurrence e l i =`. Is it possible that the problem is not reduced in that expression? (By the way, if you are looking for the first occurrence of something in a list, there's a built in function in Data.List: http://hackage.haskell.org/packages/archive/base/latest/doc/html/Data-List.html#v:elemIndex , but I suppose that's not the point of the question:) ) — Christoffer, Mar 14 '12 at 17:09
@Christoffer: That case shouldn't be there at all. Oh, I just checked and indeed, `i \`seq\` __find_first_occurrence e xs (i + 1)` solves it. — Vitus, Mar 14 '12 at 17:12
@Christoffer This is a training program ;-) Sorry about the 2nd line, it was there by mistake. — quant_dev, Mar 14 '12 at 17:17
@vitus - writing in the tail recursive style so you get TCO would be *very useful* for the function in question though, as it's answer doesn't want laziness. — stephen tetley, Mar 14 '12 at 17:47
I made a lazy evaluation version which doesn't use recursion: `find_first_occurrence elem list = (snd . head) [x | x <- zip list [0..], fst x == elem]` and it's much slower and uses much more memory. — quant_dev, Mar 14 '12 at 17:58
@stephen tetley: Of course, there are cases where TCO is invaluable (e.g. accumulator patters). What I meant is that _generally_ you don't need TCO. — Vitus, Mar 14 '12 at 18:04
Note that by having the first clause return -1, the strictness analyser sees that the accumulator is not necessarily needed, so it can't make GHC evaluate the `i+1` in each step. If you used `i` also in the failure case, say with `__find_first_occurrence e [] i = -(i+1)`, the strictness analyser would cause it to be evaluated in each step (when compiling with optimisations, of course). — Daniel Fischer, Mar 14 '12 at 18:42

score 15 · Accepted Answer · answered Mar 14 '12 at 17:12

15

The tail call optimization is used, but the (i+1) expressions get thunked, and evaluating them at the end causes the stack to overflow.

answered Mar 14 '12 at 17:12

augustss

22,884
5
56
93

Can you please explain a bit more on why the (i+i) expression would get evaluated at the end. Is it because of lazy evaluation? Thank you. – Gangadhar Mar 14 '12 at 17:27

Why is the tail call optimization not used in this Haskell program?

1 Answers1

Linked