I'm trying to write a findIndexBy
which would return the index of an element selected in a list by an ordering function.
This function is equivalent to sorting the list and returning the top element, but I want to implement it to be able to process lists without size limit.
findIndexBy :: (Ord a) => (a -> a -> Bool) -> [a] -> Integer
findIndexBy f (x:xs) = findIndexBy' xs x 1 0
where
findIndexBy' [] _ _ i = i
findIndexBy' (x:xs) y xi yi = if f x y
then findIndexBy' xs x (xi + 1) xi
else findIndexBy' xs y (xi + 1) yi
With this implementation, I get a Stack space overflow
when processing big list, as in the following example (trivial):
findIndexBy (>) [1..1000000]
I know there should be more elegant solutions to solve this problem, and I'm interested in knowing the most idiomatic and efficient ones, but I really want to understand what is wrong with my function.
I might be wrong, but I think my implementation of findIndexBy'
is based on terminal recursion, so I don't really understand why the compiler doesn't seem to optimize the tail call.
I thought it might be due to the if/then/else and also trying the following, which results in the same error:
findIndexBy :: (Ord a) => (a -> a -> Bool) -> [a] -> Integer
findIndexBy f (x:xs) = findIndexBy' xs x 1 0
where
findIndexBy' [] _ _ i = i
findIndexBy' (x:xs) y xi yi = findIndexBy' xs (if f x y then x else y) (xi + 1) (if f x y then xi else yi)
Is there a simple way to ask the compiler to show where tail-call optimization is (not) performed?
For reference, below is the equivalent function that I wrote in Clojure, and that I am now trying to port to Haskell:
(defn index-of [keep-func, coll]
(loop [i 0
a (first coll)
l (rest coll)
keep-i i]
(if (empty? l)
keep-i
(let [keep (keep-func a (first l))]
(recur
(inc i) (if keep a (first l)) (rest l) (if keep keep-i (inc i)))))))
For information, the previously quoted Haskell code was compiled using the -O3
flag.
[edit after leventov answer]
The problem seems to be related to lazy evaluation.
Although I found about $!
and seq
, I wonder what is the best practice when using them to fix the original code.
I'm still interested with more idiomatic implementations relying on functions from Data.List
.
[edit]
The simplest fix is to add yi `seq`
in the first snippet before the if
statement.