A fairly canonical source on this question is Foldr Foldl Foldl' on the Haskell Wiki. In summary, depending on how strictly you can combine elements of the list and what the result of your fold is you may decide to choose either foldr
or foldl'
. It's rarely the right choice to choose foldl
.
Generally, this is a good example of how you have to keep in mind the laziness and strictness of your functions in order to compute efficiently in Haskell. In strict languages, tail-recursive definitions and TCO are the name of the game, but those kinds of definitions may be too "unproductive" (not lazy enough) for Haskell leading to the production of useless thunks and fewer opportunities for optimization.
When to choose foldr
If the operation that consumes the result of your fold can operate lazily and your combining function is non-strict in its right argument, then foldr
is usually the right choice. The quintessential example of this is the nonfold
. First we see that (:)
is non-strict on the right
head (1 : undefined)
1
Then here's nonfold
written using foldr
nonfoldr :: [a] -> [a]
nonfoldr = foldr (:) []
Since (:)
creates lists lazily, an expression like head . nonfoldr
can be very efficient, requiring just one folding step and forcing just the head of the input list.
head (nonfoldr [1,2,3])
head (foldr (:) [] [1,2,3])
head (1 : foldr (:) [] [2,3])
1
Short-circuiting
A very common place where laziness wins out is in short-circuiting computations. For instance, lookup :: Eq a => a -> [a] -> Bool
can be more productive by returning the moment it sees a match.
lookupr :: Eq a => a -> [a] -> Bool
lookupr x = foldr (\y inRest -> if x == y then True else inRest) False
The short-circuiting occurs because we discard isRest
in the first branch of the if
. The same thing implemented in foldl'
can't do that.
lookupl :: Eq a => a -> [a] -> Bool
lookupl x = foldl' (\wasHere y -> if wasHere then wasHere else x == y) False
lookupr 1 [1,2,3,4]
foldr fn False [1,2,3,4]
if 1 == 1 then True else (foldr fn False [2,3,4])
True
lookupl 1 [1,2,3,4]
foldl' fn False [1,2,3,4]
foldl' fn True [2,3,4]
foldl' fn True [3,4]
foldl' fn True [4]
foldl' fn True []
True
When to choose foldl'
If the consuming operation or the combining requires that the entire list is processed before it can proceed, then foldl'
is usually the right choice. Often the best check for this situation is to ask yourself whether your combining function is strict---if it's strict in the first argument then your whole list must be forced anyway. The quintessential example of this is sum
sum :: Num a => [a] -> a
sum = foldl' (+) 0
since (1 + 2)
cannot be reasonably consumed prior to actually doing the addition (Haskell isn't smart enough to know that 1 + 2 >= 1
without first evaluating 1 + 2
) then we don't get any benefit from using foldr
. Instead, we'll use the strict combining property of foldl'
to make sure that we evaluate things as eagerly as needed
sum [1,2,3]
foldl' (+) 0 [1,2,3]
foldl' (+) 1 [2,3]
foldl' (+) 3 [3]
foldl' (+) 6 []
6
Note that if we pick foldl
here we don't get quite the right result. While foldl
has the same associativity as foldl'
, it doesn't force the combining operation with seq
like foldl'
does.
sumWrong :: Num a => [a] -> a
sumWrong = foldl (+) 0
sumWrong [1,2,3]
foldl (+) 0 [1,2,3]
foldl (+) (0 + 1) [2,3]
foldl (+) ((0 + 1) + 2) [3]
foldl (+) (((0 + 1) + 2) + 3) []
(((0 + 1) + 2) + 3)
((1 + 2) + 3)
(3 + 3)
6
What happens when we choose wrong?
We get extra, useless thunks (space leak) if we choose foldr
or foldl
when in foldl'
sweet spot and we get extra, useless evaluation (time leak) if we choose foldl'
when foldr
would have been a better choice.
nonfoldl :: [a] -> [a]
nonfoldl = foldl (:) []
head (nonfoldl [1,2,3])
head (foldl (:) [] [1,2,3])
head (foldl (:) [1] [2,3])
head (foldl (:) [1,2] [3]) -- nonfoldr finished here, O(1)
head (foldl (:) [1,2,3] [])
head [1,2,3]
1 -- this is O(n)
sumR :: Num a => [a] -> a
sumR = foldr (+) 0
sumR [1,2,3]
foldr (+) 0 [1,2,3]
1 + foldr (+) 0 [2, 3] -- thunks begin
1 + (2 + foldr (+) 0 [3])
1 + (2 + (3 + foldr (+) 0)) -- O(n) thunks hanging about
1 + (2 + (3 + 0)))
1 + (2 + 3)
1 + 5
6 -- forced O(n) thunks