49

In Haskell, like in many other functional languages, the function foldl is defined such that, for example, foldl (-) 0 [1,2,3,4] = -10.

This is OK, because foldl (-) 0 [1, 2,3,4] is, by definition, ((((0 - 1) - 2) - 3) - 4).

But, in Racket, (foldl - 0 '(1 2 3 4)) is 2, because Racket "intelligently" calculates like this: (4 - (3 - (2 - (1 - 0)))), which indeed is 2.

Of course, if we define auxiliary function flip, like this:

(define (flip bin-fn)
  (lambda (x y)
    (bin-fn y x)))

then we could in Racket achieve the same behavior as in Haskell: instead of (foldl - 0 '(1 2 3 4)) we can write: (foldl (flip -) 0 '(1 2 3 4))

The question is: Why is foldl in racket defined in such an odd (nonstandard and nonintuitive) way, differently than in any other language?

Óscar López
  • 232,561
  • 37
  • 312
  • 386
Racket Noob
  • 1,056
  • 1
  • 8
  • 16
  • 1
    FWIW, Chez Scheme's `fold-left` is consistent with what you're expecting: `(fold-left - 0 '(1 2 3 4))` is `-10` and `(fold-left cons '() '(1 2 3 4))` is `((((() . 1) . 2) . 3) . 4)`. – erjiang Jan 13 '12 at 02:21

4 Answers4

70
  • The Haskell definition is not uniform. In Racket, the function to both folds have the same order of inputs, and therefore you can just replace foldl by foldr and get the same result. If you do that with the Haskell version you'd get a different result (usually) — and you can see this in the different types of the two.

    (In fact, I think that in order to do a proper comparison you should avoid these toy numeric examples where both of the type variables are integers.)

  • This has the nice byproduct where you're encouraged to choose either foldl or foldr according to their semantic differences. My guess is that with Haskell's order you're likely to choose according to the operation. You have a good example for this: you've used foldl because you want to subtract each number — and that's such an "obvious" choice that it's easy to overlook the fact that foldl is usually a bad choice in a lazy language.

  • Another difference is that the Haskell version is more limited than the Racket version in the usual way: it operates on exactly one input list, whereas Racket can accept any number of lists. This makes it more important to have a uniform argument order for the input function).

  • Finally, it is wrong to assume that Racket diverged from "many other functional languages", since folding is far from a new trick, and Racket has roots that are far older than Haskell (or these other languages). The question could therefore go the other way: why is Haskell's foldl defined in a strange way? (And no, (-) is not a good excuse.)

Historical update:

Since this seems to bother people again and again, I did a little bit of legwork. This is not definitive in any way, just my second-hand guessing. Feel free to edit this if you know more, or even better, email the relevant people and ask. Specifically, I don't know the dates where these decisions were made, so the following list is in rough order.

  • First there was Lisp, and no mention of "fold"ing of any kind. Instead, Lisp has reduce which is very non-uniform, especially if you consider its type. For example, :from-end is a keyword argument that determines whether it's a left or a right scan and it uses different accumulator functions which means that the accumulator type depends on that keyword. This is in addition to other hacks: usually the first value is taken from the list (unless you specify an :initial-value). Finally, if you don't specify an :initial-value, and the list is empty, it will actually apply the function on zero arguments to get a result.

    All of this means that reduce is usually used for what its name suggests: reducing a list of values into a single value, where the two types are usually the same. The conclusion here is that it's serving a kind of a similar purpose to folding, but it's not nearly as useful as the generic list iteration construct that you get with folding. I'm guessing that this means that there's no strong relation between reduce and the later fold operations.

  • The first relevant language that follows Lisp and has a proper fold is ML. The choice that was made there, as noted in newacct's answer below, was to go with the uniform types version (ie, what Racket uses).

  • The next reference is Bird & Wadler's ItFP (1988), which uses different types (as in Haskell). However, they note in the appendix that Miranda has the same type (as in Racket).

  • Miranda later on switched the argument order (ie, moved from the Racket order to the Haskell one). Specifically, that text says:

    WARNING - this definition of foldl differs from that in older versions of Miranda. The one here is the same as that in Bird and Wadler (1988). The old definition had the two args of `op' reversed.

  • Haskell took a lot of stuff from Miranda, including the different types. (But of course I don't know the dates so maybe the Miranda change was due to Haskell.) In any case, it's clear at this point that there was no consensus, hence the reversed question above holds.

  • OCaml went with the Haskell direction and uses different types

  • I'm guessing that "How to Design Programs" (aka HtDP) was written at roughly the same period, and they chose the same type. There is, however, no motivation or explanation — and in fact, after that exercise it's simply mentioned as one of the built-in functions.

    Racket's implementation of the fold operations was, of course, the "built-ins" that are mentioned here.

  • Then came SRFI-1, and the choice was to use the same-type version (as Racket). This decision was question by John David Stone, who points at a comment in the SRFI that says

    Note: MIT Scheme and Haskell flip F's arg order for their reduce and fold functions.

    Olin later addressed this: all he said was:

    Good point, but I want consistency between the two functions. state-value first: srfi-1, SML state-value last: Haskell

    Note in particular his use of state-value, which suggests a view where consistent types are a possibly more important point than operator order.

Sled
  • 18,541
  • 27
  • 119
  • 168
Eli Barzilay
  • 29,301
  • 3
  • 67
  • 110
  • 10
    For what it's worth, I think that the reasoning is thus: for cons lists, `foldr` is precisely the natural catamorphism, and thus its first argument has type `a -> b -> b` over a list with constructor `(:) :: a -> [a] -> [a]`. On the other hand, `foldl` is precisely the natural catamorphism for a *snoc* list, built in reverse, and so the first argument has `b -> a -> b` for the imaginary constructor `Snoc :: [a] -> a -> [a]`. – Antal Spector-Zabusky Jan 09 '12 at 04:18
  • 7
    "The Haskell definition is not uniform" -- but it is! There exists `x`, `y` such that `foldl (foldl f) x y` is well-typed for all `f` in Haskell, and the same holds true for `foldr (foldr f) x y`. _This is not true of Racket's foldl!_ Granted this doesn't matter much in Racket (not having currying) but currying is big in ML-derived languages so the order used in Haskell is important. Of course one could say foldl & foldr should both just use Haskell's foldr arg order. (BTW Eli -- I had a long discussion with SK once about this, it was one of the few times I was able to change his mind!) – Chris Pacejo Jan 09 '12 at 14:59
  • 6
    foldl is not "usually" a bad choice in a lazy language - especially if you employ it to compute a single number, string or such like. If you need that number, you need it, there is no way around evaluating the whole list, no matter if from the right or from the left. – Ingo Jan 09 '12 at 16:58
  • @ChrisK: Sure, I completely agree that currying is a major point in ML & Haskell. But the point is that these languages are generally much younger. (If that's what you convinced SK then I'm not surprised...) – Eli Barzilay Jan 09 '12 at 17:23
  • @Ingo: the problem is not just evaluating the whole list or not, it's the stack space that gets consumed in the process. – Eli Barzilay Jan 09 '12 at 17:24
  • 1
    @Eli - the definition of foldl is tail recursive in Haskell, so it doen't use any stack space. The problem is sometimes when foldl builds a large thunk that represents the result. When that thunk is evaluated after foldl is done, then stack overflows may happen. Use foldl' in such cases which needs constant stack space. Btw, the same problem with the unevaluated thuks can happen with foldr f where f is strict in its second argument. – Ingo Jan 09 '12 at 17:30
  • @Ingo: Yes, that's the problem I was talking about. I didn't know about `foldl'` though. (Unsurprisingly, it looks like it's addressing the problem by introducing strictness. I wonder how many people do the mistake of not knowing about the whole issue...) – Eli Barzilay Jan 09 '12 at 17:48
  • If two functions return the same result means they have *no* semantic difference, and you're only choosing between them for *operational* (efficiency) concerns. With Haskell's order you choose between them based on what you want to compute (and admittedly you then have to choose between `foldl` and `foldl'` for operational concerns). – Ben Jan 09 '12 at 23:41
  • 1
    @EliBarzilay since you've asked, about every last one of the newbies does this mistake with `foldl` and the thunks blow-up. – Will Ness Feb 04 '13 at 16:52
  • 1
    @Ingo if `f` in `foldr` is strict in 2nd arg, the arg's eval will only be triggered *after* entering `f`, so there won't be any thunks involved, only stack (which can of course then overflow (... just nitpicking)). – Will Ness Feb 04 '13 at 17:32
  • @WillNess What you say is correct. My formulation was misleading. The overall result for the user is, however, the same. – Ingo Feb 04 '13 at 18:15
  • 1
    @Ingo as I said, just nitpicking (for the sake of eternity). :) you said "unevaluated thunks" but you meant "stack overflow", obviously. It might be non-obvious for a newbie though, that's why I wanted to correct it. Nothing more aggravating for a newbie (speaking from experience here) than trying to understand not-quite-precisely-formulated statements. And, thanks for confirming this. :) – Will Ness Feb 04 '13 at 18:39
  • Scheme doesn't have fold, but SRFI-1 has (and Racket, but thats not Scheme). R6RS list library has fold-left and fold-right where the argument order is swapped like in haskell. – Sylwester Nov 05 '13 at 08:32
10

"differently than in any other language"

As a counter-example, Standard ML (ML is a very old and influential functional language)'s foldl also works this way: http://www.standardml.org/Basis/list.html#SIG:LIST.foldl:VAL

newacct
  • 119,665
  • 29
  • 163
  • 224
8

Racket's foldl and foldr (and also SRFI-1's fold and fold-right) have the property that

(foldr cons null lst) = lst
(foldl cons null lst) = (reverse lst)

I speculate the argument order was chosen for that reason.

Ryan Culpepper
  • 10,495
  • 4
  • 31
  • 30
3

From the Racket documentation, the description of foldl:

(foldl proc init lst ...+) → any/c

Two points of interest for your question are mentioned:

the input lsts are traversed from left to right

And

foldl processes the lsts in constant space

I'm gonna speculate on how the implementation for that might look like, with a single list for simplicity's sake:

(define (my-foldl proc init lst)
  (define (iter lst acc)
    (if (null? lst)
        acc
        (iter (cdr lst) (proc (car lst) acc))))
  (iter lst init))

As you can see, the requirements of left-to-right traversal and constant space are met (notice the tail recursion in iter), but the order of the arguments for proc was never specified in the description. Hence, the result of calling the above code would be:

(my-foldl - 0 '(1 2 3 4))
> 2

If we had specified the order of the arguments for proc in this way:

(proc acc (car lst))

Then the result would be:

(my-foldl - 0 '(1 2 3 4))
> -10

My point is, the documentation for foldl doesn't make any assumptions on the evaluation order of the arguments for proc, it only has to guarantee that constant space is used and that the elements in the list are evaluated from left to right.

As a side note, you can get the desired evaluation order for your expression by simply writing this:

(- 0 1 2 3 4)
> -10
Will Ness
  • 70,110
  • 9
  • 98
  • 181
Óscar López
  • 232,561
  • 37
  • 312
  • 386
  • 4
    Personally, I would much rather the description for a function was specific about the result it returns than how much stack space it uses! – Ben Jan 09 '12 at 23:31
  • 1
    @Ben [the doc does show an example application](http://docs.racket-lang.org/reference/pairs.html#%28def._%28%28lib._racket%2Fprivate%2Flist..rkt%29._foldl%29%29) of `> (foldl cons '() '(1 2 3 4))` ==> `'(4 3 2 1)` which can only be with one specific order of args. They could stress it some more, I agree. – Will Ness Feb 04 '13 at 17:31