foldl versus foldr behavior with infinite lists

Question

The code for the myAny function in this question uses foldr. It stops processing an infinite list when the predicate is satisfied.

I rewrote it using foldl:

myAny :: (a -> Bool) -> [a] -> Bool
myAny p list = foldl step False list
   where
      step acc item = p item || acc

(Note that the arguments to the step function are correctly reversed.)

However, it no longer stops processing infinite lists.

I attempted to trace the function's execution as in Apocalisp's answer:

myAny even [1..]
foldl step False [1..]
step (foldl step False [2..]) 1
even 1 || (foldl step False [2..])
False  || (foldl step False [2..])
foldl step False [2..]
step (foldl step False [3..]) 2
even 2 || (foldl step False [3..])
True   || (foldl step False [3..])
True

However, this is not the way the function behaves. How is this wrong?

C. A. McCann · Accepted Answer · 2010-06-21T15:31:51.863

How folds differ seems to be a frequent source of confusion, so here's a more general overview:

Consider folding a list of n values [x1, x2, x3, x4 ... xn ] with some function f and seed z.

`foldl` is:

Left associative: f ( ... (f (f (f (f z x1) x2) x3) x4) ...) xn
Tail recursive: It iterates through the list, producing the value afterwards
Lazy: Nothing is evaluated until the result is needed
Backwards: foldl (flip (:)) [] reverses a list.

`foldr` is:

Right associative: f x1 (f x2 (f x3 (f x4 ... (f xn z) ... )))
Recursive into an argument: Each iteration applies f to the next value and the result of folding the rest of the list.
Lazy: Nothing is evaluated until the result is needed
Forwards: foldr (:) [] returns a list unchanged.

There's a slightly subtle point here that trips people up sometimes: Because foldl is backwards each application of f is added to the outside of the result; and because it is lazy, nothing is evaluated until the result is required. This means that to compute any part of the result, Haskell first iterates through the entire list constructing an expression of nested function applications, then evaluates the outermost function, evaluating its arguments as needed. If f always uses its first argument, this means Haskell has to recurse all the way down to the innermost term, then work backwards computing each application of f.

This is obviously a far cry from the efficient tail-recursion most functional programmers know and love!

In fact, even though foldl is technically tail-recursive, because the entire result expression is built before evaluating anything, foldl can cause a stack overflow!

On the other hand, consider foldr. It's also lazy, but because it runs forwards, each application of f is added to the inside of the result. So, to compute the result, Haskell constructs a single function application, the second argument of which is the rest of the folded list. If f is lazy in its second argument--a data constructor, for instance--the result will be incrementally lazy, with each step of the fold computed only when some part of the result that needs it is evaluated.

So we can see why foldr sometimes works on infinite lists when foldl doesn't: The former can lazily convert an infinite list into another lazy infinite data structure, whereas the latter must inspect the entire list to generate any part of the result. On the other hand, foldr with a function that needs both arguments immediately, such as (+), works (or rather, doesn't work) much like foldl, building a huge expression before evaluating it.

So the two important points to note are these:

foldr can transform one lazy recursive data structure into another.
Otherwise, lazy folds will crash with a stack overflow on large or infinite lists.

You may have noticed that it sounds like foldr can do everything foldl can, plus more. This is true! In fact, foldl is nearly useless!

But what if we want to produce a non-lazy result by folding over a large (but not infinite) list? For this, we want a strict fold, which the standard libraries thoughfully provide:

`foldl'` is:

Left associative: f ( ... (f (f (f (f z x1) x2) x3) x4) ...) xn
Tail recursive: It iterates through the list, producing the value afterwards
Strict: Each function application is evaluated along the way
Backwards: foldl' (flip (:)) [] reverses a list.

Because foldl' is strict, to compute the result Haskell will evaluate f at each step, instead of letting the left argument accumulate a huge, unevaluated expression. This gives us the usual, efficient tail recursion we want! In other words:

foldl' can fold large lists efficiently.
foldl' will hang in an infinite loop (not cause a stack overflow) on an infinite list.

The Haskell wiki has a page discussing this, as well.

I came here because I am curious why `foldr` is better than `foldl` in **Haskell**, while the opposite is true in **Erlang** (which I learned before **Haskell**). Since **Erlang** is not lazy and functions are not *curried*, so `foldl` in **Erlang** behaves like `foldl'` above. This is a great answer! Good job and thanks! — Siu Ching Pong -Asuka Kenji-, Mar 02 '14 at 14:53
This is mostly a great explanation, but I find the description of `foldl` as "backward" and `foldr` as "forward" problematic. This is in part because `flip` is being applied to `(:)` in the illustration of why fold is backward. The natural reaction is, "of course it's backward: you `flip`ped list concatenation!" It's also strange to see that called "backward" since `foldl` applies `f` to the first list element first (innermost) in a complete evaluation. It's `foldr` that "runs backward," applying `f` to the last element first. — Dave Abrahams, Nov 02 '14 at 00:59
@DaveAbrahams: Between just `foldl` and `foldr` and ignoring strictness and optimizations, first means "outermost", not "innermost". This is why `foldr` can process infinite lists and `foldl` can't--the right fold first applies `f` to the first list element and the (unevaluated) result of folding the tail, while the left fold must traverse the entire list to evaluate the outermost application of `f`. — C. A. McCann, Nov 10 '14 at 16:29
I'm just wondering if there is any instance where foldl would be preferred over foldl', do you think there's one? — kazuoua, Dec 30 '15 at 20:25
@kazuoua where the laziness is essential, e.g. `last xs = foldl (\a z-> z) undefined xs`. — Will Ness, Aug 13 '16 at 10:28
Great answer, but I think it's more clear to represent left associative of foldl as: `f (foldl f [] [x1, x2,... xn-1]) xn` and left associative of foldr as `f x1 (foldr f [] [x2, x3, ... xn])`. In this representation, it's clear that foldr can deal with infinite list. — Dagang, Jan 11 '17 at 01:39
Thanks for this explanation - I had trouble understanding why `foldr` worked better on infinite lists (if `foldr` calculates from the right, how does it work on infinite lists which don't have a defined right-hand side?) but this answer explains it well :) — nyctef, Jan 13 '17 at 16:34
@SiuChingPong-AsukaKenji- I was wondering if laziness is the sole reason behind the Haskell behavior and you exactly answered it. — dhu, Feb 23 '18 at 19:38
The notions of `start` and `end` or `first` and `last` would have been better terminology to the current `left` and `right` or even `top` and `bottom` since these notions pertain to which end of a list new elements `append`. — George, Mar 14 '21 at 04:21
I think the critical detail is that `foldl` is *not* `f (foldl f z xs) x` but rather `foldl f (f z x) xs`. If I'm not mistaken, the first definition is `foldl'`. — BallpointBen, Jul 23 '21 at 20:12
@George or 'head', 'last' as already used in the language. So foldh, foldl. Ok, 'from the head' might not be clear from the word head alone, while your suggestion of 'start' and 'first' induces the idea of direction almost as well as left right. For me, top bottom would be a dyslexya-frienfly disaster however — dawid, Jul 08 '22 at 00:23

Artelius · Answer 2 · 2010-06-21T06:18:28.690

28

myAny even [1..]
foldl step False [1..]
foldl step (step False 1) [2..]
foldl step (step (step False 1) 2) [3..]
foldl step (step (step (step False 1) 2) 3) [4..]

etc.

Intuitively, foldl is always on the "outside" or on the "left" so it gets expanded first. Ad infinitum.

edited Jun 21 '10 at 06:18

answered Jun 21 '10 at 05:54

Artelius

48,337
13
89
105

score 12 · Answer 3 · edited Mar 22 '16 at 19:14

12

You can see in Haskell's documentation here that foldl is tail-recursive and will never end if passed an infinite list, since it calls itself on the next parameter before returning a value...

edited Mar 22 '16 at 19:14

Matthias Braun

32,039
22
142
171

answered Jun 21 '10 at 05:38

Romain

12,679
3
41
54

2

So the "trick" is that if `f` can sometimes return just having seen the value of its first argument without needing to look at its second argument, then `foldr` gets to exit early (because with `foldr`, the second argument to `f` will be the result of another `foldr` call). Whereas `foldl` immediately recurses regardless of what `f` does. – BallpointBen Jul 23 '21 at 20:05

score 0 · Answer 4 · answered Jun 21 '10 at 05:40

0

I dont know Haskell, but in Scheme, fold-right will always 'act' on the last element of a list first. Thus is will not work for cyclic list (which is the same as an infinite one).

I am not sure if fold-right can be written tail-recursive, but for any cyclic list you should get a stack overflow. fold-left OTOH is normally implemented with tail recursion, and will just get stuck in an infinite loop, if not terminating it early.

answered Jun 21 '10 at 05:40

leppie

115,091
17
196
297

4

It's different in Haskell because of laziness. – Lifu Huang Aug 13 '17 at 21:36

foldl versus foldr behavior with infinite lists

4 Answers4

`foldl` is:

`foldr` is:

`foldl'` is:

Linked

Related

foldl versus foldr behavior with infinite lists

4 Answers4

foldl is:

foldr is:

foldl' is:

Linked

Related

`foldl` is:

`foldr` is:

`foldl'` is: