174

What is the most idiomatic way to achieve something like the following, in Haskell:

foldl (+) 0 [1,2,3,4,5]
--> 15

Or its equivalent in Ruby:

[1,2,3,4,5].inject(0) {|m,x| m + x}
#> 15

Obviously, Python provides the reduce function, which is an implementation of fold, exactly as above, however, I was told that the 'pythonic' way of programming was to avoid lambda terms and higher-order functions, preferring list-comprehensions where possible. Therefore, is there a preferred way of folding a list, or list-like structure in Python that isn't the reduce function, or is reduce the idiomatic way of achieving this?

jamylak
  • 128,818
  • 30
  • 231
  • 230
mistertim
  • 5,123
  • 4
  • 20
  • 27
  • 3
    `sum` isn't good enough? – JBernardo Apr 28 '12 at 18:32
  • 3
    not sure if this is a good example for your question. It can easily be achieved with `sum`, you may want to provide some different types of examples. – jamylak Apr 28 '12 at 18:33
  • 22
    Hey JBernardo - Summing over a list of numbers was meant as a rather degenerate example, I'm more interested in the general idea of accumulating the elements of a list using some binary operation, and a starting value, not summing integers specifically. – mistertim Apr 28 '12 at 18:43
  • 1
    @mistertim: `sum()` actually provides limited functionality with this. `sum([[a], [b, c, d], [e, f]], [])` returns `[a, b, c, d, e, f]` for example. – Joel Cornett Apr 28 '12 at 19:34
  • Although the case of doing it with lists is a good demonstration of things to watch for with this technique - `+` on lists is a linear time operation in both time and memory, making the whole call quadratic. Using `list(itertools.chain.from_iterable([a], [b,c,d],[e,f],[]])` is linear overall - and if you only need to iterate over it once, you can drop the call to `list` to make it constant in terms of memory. – lvc May 21 '13 at 09:48
  • It should be noted foldLeft is generally something different from reduce... You can only reduce on monoids. foldLeft is a much more general thing. – dividebyzero Jan 27 '17 at 17:13

8 Answers8

159

The Pythonic way of summing an array is using sum. For other purposes, you can sometimes use some combination of reduce (from the functools module) and the operator module, e.g.:

def product(xs):
    return reduce(operator.mul, xs, 1)

Be aware that reduce is actually a foldl, in Haskell terms. There is no special syntax to perform folds, there's no builtin foldr, and actually using reduce with non-associative operators is considered bad style.

Using higher-order functions is quite pythonic; it makes good use of Python's principle that everything is an object, including functions and classes. You are right that lambdas are frowned upon by some Pythonistas, but mostly because they tend not to be very readable when they get complex.

goetz
  • 2,018
  • 2
  • 30
  • 33
Fred Foo
  • 355,277
  • 75
  • 744
  • 836
  • Higher-order functions may not always be pythonic. Otherwise why would `reduce` be removed from the Py3K builtin namespace? – JBernardo Apr 28 '12 at 18:41
  • 4
    @JBernardo: you're saying that anything not in the builtins module is not pythonic? – Fred Foo Apr 28 '12 at 18:42
  • 4
    No, that would be stupid to say. But give me a single reason why do you think [GvR would hate so much the reduce function](http://www.artima.com/weblogs/viewpost.jsp?thread=98196) at the point of removing it from builtins? – JBernardo Apr 28 '12 at 18:47
  • 7
    @JBernardo: because people try to play too smart tricks with it. To quote from that blog post, "the applicability of `reduce()` is pretty much limited to associative operators, and in all other cases it's better to write out the accumulation loop explicitly." So, its use is limited, but even GvR apparently had to admit its useful enough to keep it in the standard library. – Fred Foo Apr 28 '12 at 18:50
  • 21
    @JBernardo, so does that mean that every usage of fold in Haskell and Scheme is equally bad? It's just a different style of programming, ignoring it and putting your fingers in your ears and saying it's unclear doesn't make it so. Like most things that are a different style _it takes practice to get used to it_. The idea is to put things into general categories so it's easier to reason about programs. "Oh I want to do this, hmm, looks like a fold" (or a map, or an unfold, or an unfold then a fold over that) – Wes Apr 28 '12 at 20:00
  • 2
    @Wes: I was thinking primarily of functions passed to `reduce` or `map` that perform side-effects, or that take and return long tuples to perform too many computations at once. In Haskell, using `foldl` may sometimes the only reasonable option, but even in Scheme, there is a point where the loop gets so complicated that you want a named `let` instead. – Fred Foo Apr 28 '12 at 20:13
  • @larsmans: I personally try to avoid using lambdas in both Scheme and Haskell because I think it makes things less readable. On this point I agree 100% with pythonistas. – Wes Apr 28 '12 at 20:16
  • 4
    Lambda in Python can't contain more than one expression. You can't make it complex even if you try hard. So Pythonistas who don't like them are probably just not used to and hence don't like functional programming style. – golem Mar 11 '15 at 18:21
  • reduce is a specific case of foldl. reduce can only transform collection of `A`'s into specific `A`, while fold can generate almost anything – Sagi Sep 22 '22 at 19:26
74

Starting Python 3.8, and the introduction of assignment expressions (PEP 572) (:= operator), which gives the possibility to name the result of an expression, we can use a list comprehension to replicate what other languages call fold/foldleft/reduce operations:

Given a list, a reducing function and an accumulator:

items = [1, 2, 3, 4, 5]
f = lambda acc, x: acc * x
accumulator = 1

we can fold items with f in order to obtain the resulting accumulation:

[accumulator := f(accumulator, x) for x in items]
# accumulator = 120

or in a condensed formed:

acc = 1; [acc := acc * x for x in [1, 2, 3, 4, 5]]
# acc = 120

Note that this is actually also a "scanleft" operation as the result of the list comprehension represents the state of the accumulation at each step:

acc = 1
scanned = [acc := acc * x for x in [1, 2, 3, 4, 5]]
# scanned = [1, 2, 6, 24, 120]
# acc = 120
Xavier Guihot
  • 54,987
  • 21
  • 291
  • 190
  • 10
    this should be an accepted answer and not a suggestion to use `sum` – dark_ruby Dec 07 '21 at 17:01
  • 2
    (Attn: @mistertim) OK, so this is the actual answer. I am so sad that this not the top answer, and that many other pages say that the pythonic analog of reduce is, well, reduce() For the sake of Googlability: the pythonic analog of reduce is using := to introduce an accumulator into a list comprehension. Please change the "accepted answer" to this – Roman Kogan Dec 14 '21 at 08:25
  • 4
    This isn't truly a fold, though. As noted, it's a "scanleft" that's throwing away the temporary list after. For a long list, that difference could matter. – D0SBoots Jun 12 '22 at 10:27
  • 1
    Why this can matter: >>> timeit.Timer("acc=0\nfor i in range(10000):acc+=i").repeat(5, 100) [0.036283817142248154, 0.032444920390844345, 0.03235280327498913, 0.032462552189826965, 0.03250854276120663] >>> timeit.Timer("acc=0\n[acc:=acc+i for i in range(10000)]").repeat(5, 100) [0.04360721819102764, 0.03987771272659302, 0.03988228552043438, 0.03988049365580082, 0.039858657866716385] – D0SBoots Jun 12 '22 at 10:34
  • 1
    The difference in terms of memory usage is more important than the difference in timing. Oftentimes fold is over iterable objects that you cannot keep in memory. – Dan Ganea Oct 25 '22 at 16:35
23

Haskell

foldl (+) 0 [1,2,3,4,5]

Python

reduce(lambda a,b: a+b, [1,2,3,4,5], 0)

Obviously, that is a trivial example to illustrate a point. In Python you would just do sum([1,2,3,4,5]) and even Haskell purists would generally prefer sum [1,2,3,4,5].

For non-trivial scenarios when there is no obvious convenience function, the idiomatic pythonic approach is to explicitly write out the for loop and use mutable variable assignment instead of using reduce or a fold.

That is not at all the functional style, but that is the "pythonic" way. Python is not designed for functional purists. See how Python favors exceptions for flow control to see how non-functional idiomatic python is.

clay
  • 18,138
  • 28
  • 107
  • 192
  • 15
    folds are useful to more than functional "purists". They are general purpose abstractions. Recursive problems are pervasive in computing. Folds offer a way to remove the boilerplate and a way to make recursive solutions safe in languages which don't natively support recursion. So a very practical thing. GvR's prejudices in this area are unfortunate. – itsbruce Aug 03 '17 at 20:05
  • 6
    It's utterly bizarre to me that JavaScript has syntactically cleaner and more useful lambdas and higher-order functions than Python. This is really upsetting; Python is otherwise such a well-designed and attractive language. – iono Feb 05 '21 at 04:22
  • I agree with the bizarre part but not the "otherwise well designed..." I use the language as the primary for four years and heavy secondary for eight before that. At least there has been some improvements such as type hints, walrus operator, dict comprehensions. – WestCoastProjects May 03 '23 at 14:05
15

In Python 3, the reduce has been removed: Release notes. Nevertheless you can use the functools module

import operator, functools
def product(xs):
    return functools.reduce(operator.mul, xs, 1)

On the other hand, the documentation expresses preference towards for-loop instead of reduce, hence:

def product(xs):
    result = 1
    for i in xs:
        result *= i
    return result
Kyr
  • 5,383
  • 2
  • 27
  • 22
  • 11
    `reduce` wasn't removed from the Python 3 standard library. `reduce` moved to the `functools` module as you show. – clay Dec 11 '17 at 17:29
  • 1
    @clay, I just took the phrase from Guido's release notes, but you may be right :) – Kyr Dec 11 '17 at 18:17
  • i'm glad Guido is [mostly ?] gone. He professes to dislike functional programming and wanted to even give _lambda_ the boot. – WestCoastProjects May 03 '23 at 14:06
9

Not really answer to the question, but one-liners for foldl and foldr:

a = [8,3,4]

## Foldl
reduce(lambda x,y: x**y, a)
#68719476736

## Foldr
reduce(lambda x,y: y**x, a[::-1])
#14134776518227074636666380005943348126619871175004951664972849610340958208L
Mehdi Nellen
  • 8,486
  • 4
  • 33
  • 48
  • 10
    I think this is a better way to write your foldr: `reduce(lambda y, x: x**y, reversed(a))`. It now has a more natural usage, works with iterators, and consumes less memory. – Mateen Ulhaq Oct 20 '18 at 02:12
8

You can reinvent the wheel as well:

def fold(f, l, a):
    """
    f: the function to apply
    l: the list to fold
    a: the accumulator, who is also the 'zero' on the first call
    """ 
    return a if(len(l) == 0) else fold(f, l[1:], f(a, l[0]))

print "Sum:", fold(lambda x, y : x+y, [1,2,3,4,5], 0)

print "Any:", fold(lambda x, y : x or y, [False, True, False], False)

print "All:", fold(lambda x, y : x and y, [False, True, False], True)

# Prove that result can be of a different type of the list's elements
print "Count(x==True):", 
print fold(lambda x, y : x+1 if(y) else x, [False, True, True], 0)
Frédéric
  • 565
  • 1
  • 5
  • 10
  • You swap the arguments to `f` around in your recursive case. – KayEss Jun 20 '13 at 03:33
  • 11
    Because Python lacks tail recursion, this will break on longer lists and is wasteful. Furthermore, this is not truly the "fold" function, but merely a left fold, i.e. foldl, that is, *exactly* what `reduce` already offers (note that reduce's function signature is `reduce(function, sequence[, initial]) -> value` - it, too, includes the functionality of giving an initial value for the accumulator). – Colin Emonds Nov 10 '14 at 16:58
2

I believe some of the respondents of this question have missed the broader implication of the fold function as an abstract tool. Yes, sum can do the same thing for a list of integers, but this is a trivial case. fold is more generic. It is useful when you have a sequence of data structures of varying shape and want to cleanly express an aggregation. So instead of having to build up a for loop with an aggregate variable and manually recompute it each time, a fold function (or the Python version, which reduce appears to correspond to) allows the programmer to express the intent of the aggregation much more plainly by simply providing two things:

  • A default starting or "seed" value for the aggregation.
  • A function that takes the current value of the aggregation (starting with the "seed") and the next element in the list, and returns the next aggregation value.
rq_
  • 21
  • 1
  • 1
    Hi rq_! I think your answer would be improved and add a great deal if you gave a non-trivial example of `fold` that is difficult to do cleanly in Python, and then "`fold`" that in Python :-) – Scott Skiles Apr 04 '19 at 17:01
0

The actual answer to this (reduce) problem is: Just use a loop!

initial_value = 0
for x in the_list:
    initial_value += x #or any function.

This will be faster than a reduce and things like PyPy can optimize loops like that.

BTW, the sum case should be solved with the sum function

Nayuki
  • 17,911
  • 6
  • 53
  • 80
JBernardo
  • 32,262
  • 10
  • 90
  • 115
  • 8
    This would not be considered pythonic for an example such as this one. – jamylak Apr 28 '12 at 18:46
  • 9
    Python loops are notoriously slow. Using (or abusing) `reduce` is a common way of optimizing a Python program. – Fred Foo Apr 28 '12 at 18:48
  • 2
    @larsmans Please, don't come to say reduce is faster than a simple loop... It will have always a function call overhead for each iteration. Also, again, Pypy can optimize loops to C speed – JBernardo Apr 28 '12 at 18:50
  • 2
    @JBernardo: yes, that's what I'm claiming. I just profiled my version of `product` against one in your style, and *it's faster* (marginally, though). – Fred Foo Apr 28 '12 at 18:53
  • 2
    @JBernardo Assuming a builtin function (like `operator.add`) as argument to reduce: That extra call is a C call (which is much cheaper than a Python call), and it saves dispatching and interpreting a couple of bytecode instructions, which can easily cause dozens of function calls. –  Apr 28 '12 at 19:07
  • 1
    For any current readers, the preference is list comprehensions or generator expressions. They are faster than the loop above. Should speed be a concern, pypy will usually give a 3-5x speedup (not C speed I'm afraid) but numba and cython can both approach C speed with the right data structures. – John 9631 May 15 '17 at 07:04
  • 2
    @John9631 List comprehensions or generators would be a bad substitute for a general-purpose fold in pretty much any language, certainly Haskell and Python. Folds carry an accumulator, which is very bad style in Python comprehensions and impossible in Haskell. In Python, if you want accumulated state, the *orthodox* way to do it is a for loop, not a comprehension. – itsbruce Aug 03 '17 at 19:56
  • I very rarely downvote but this answer completely misses the point of _reduce_ and is incorrect in the performance claim. – WestCoastProjects May 03 '23 at 14:08