Why does Python's sum() require a start argument?

Question

Python's sum() function is really useful and generic except for its requirement for the start argument (sum(iterable, /, start=0)), which defaults to 0. Because of this default, we can only sum iterables whose elements support addition with integers; otherwise, we have to pass start explicitly. This hurts readability and sometimes leads to unnecessary calculations.

Consider the following example: say we want to sum an iterable of some nontrivial objects such as sympy's matrices (if you are not familiar with sympy, treat sp.Matrix as an object that supports addition, but not with integers). The following code will fail:

import sympy as sp
a = sp.Matrix([1, 2])
b = sp.Matrix([3, 4])
sum([a, b]) #TypeError: cannot add <class 'sympy.matrices.dense.MutableDenseMatrix'> and <class 'int'>

How can we repair this? Let's try writing a specific function:

from typing import List

def sum_matrices(ms: List[sp.Matrix]) -> sp.Matrix:
    return sum(ms, 0*ms[0])

However, this only works because we can define some analogue of a zero element. If we were unable to do that, it would be much easier to implement another generic sum() without a requirement for start than to try finding a different start every time. Moreover, multiplying by zero might be quite an expensive operation if the matrices are huge, and creating a zero matrix up front takes extra lines of code, even though the aim of sum() is to reduce boilerplate.

So, the questions are:

1. Why did they even think of this start argument? How can it be helpful at all? At the very least, they could have made it start=None and then checked whether summation should begin with the first element of the iterable (if start is None) or with start otherwise.
2. What is the best way to deal with such situations?
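
The behaviour proposed in the first question can be sketched in a few lines. This is only an illustration, not how the built-in works: the name sum_from_first and the _MISSING sentinel are hypothetical, and datetime.timedelta stands in for any type that supports addition with itself but not with integers. A private sentinel is used instead of None so that None could still be passed as a real start value.

```python
from datetime import timedelta

_MISSING = object()  # hypothetical sentinel meaning "no start given"


def sum_from_first(iterable, start=_MISSING):
    """Add up the items, seeding the total with the first item when no start is given."""
    it = iter(iterable)
    if start is _MISSING:
        try:
            total = next(it)
        except StopIteration:
            # There is nothing to return for an empty iterable without a start value.
            raise TypeError("sum_from_first() of empty iterable with no start") from None
    else:
        total = start
    for item in it:
        total = total + item
    return total


durations = [timedelta(hours=1), timedelta(hours=2)]
total = sum_from_first(durations)  # no start needed for a non-empty input
```

With this design the empty case has to raise, because there is no value of the right type to return.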

The idea of built-in functions is to support the widest uses for all users. This is the designers' decisions. I am not very familiar with sympy, but have you tried `sum([a, b], start=sp.Matrix())`? — Tomerikoo, Mar 25 '20 at 11:51
1. Stack Overflow is not the right place to discuss the design decision of the people who invented the Python language. 2. `functools.reduce()`. — Klaus D., Mar 25 '20 at 11:53
Perhaps this [stack question](https://stackoverflow.com/questions/49292151/sum-of-a-list-of-sympy-matrices) helps you — Breno Fachini, Mar 25 '20 at 11:55
It was not designed to be flexible, if you check its documentation, it states "This function is intended specifically for use with numeric values and may reject non-numeric types." — juanpa.arrivillaga, Mar 25 '20 at 12:09
`sum` *isn't* a generic "map `+` onto iterable" tool. It explicitly doesn't support `str`, and even for `float` it is not generally recommended, as [explicitly noted in the docs](https://docs.python.org/3/library/functions.html#sum). If you have to worry whether your zero element is cheap enough, `sum` is not appropriate. — MisterMiyagi, Mar 25 '20 at 12:11

score 4 · Accepted Answer · answered Mar 25 '20 at 12:02

start exists because empty sums are important to support, and they cannot be supported without a start value. If there were no start value, that would simplify guaranteed-nonempty sums of unusual types, but at the cost of making possibly-empty sums much more awkward.

Possibly-empty sums are far more common than sum calls with unusual types, and most sum calls with unusual types are possibly-empty anyway, so they still need start.
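
A small sketch of the empty case, using datetime.timedelta as a stand-in for an "unusual" type (the helper name total_duration is made up for illustration). With an explicit start of the right type, an empty input needs no special handling:

```python
from datetime import timedelta


def total_duration(durations):
    # The start value doubles as the result for an empty input.
    return sum(durations, timedelta(0))


busy = total_duration([timedelta(minutes=30), timedelta(minutes=45)])
idle = total_duration([])  # empty, yet still a well-defined timedelta

print(busy)     # 1:15:00
print(idle)     # 0:00:00
print(sum([]))  # 0, the default start
```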

score 3 · Answer 2 · answered Mar 25 '20 at 11:51

You can read a discussion with similar complaints here: https://mail.python.org/archives/list/python-ideas@python.org/thread/F7JFIQE7Q372AN7P6HO5QIGFNFBCXWES/
If you are confident that what you are summing is never empty, use:

functools.reduce(operator.add, ms)
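
For completeness, here is a self-contained version of that call, with datetime.timedelta objects standing in for the matrices (an assumption made so the snippet runs without sympy). It also shows what happens on empty input, and reduce's optional third argument, which plays the same role as start:

```python
import functools
import operator
from datetime import timedelta

durations = [timedelta(hours=1), timedelta(hours=2)]

# Non-empty input: the first element seeds the result.
total = functools.reduce(operator.add, durations)

# Empty input without an initial value raises TypeError.
try:
    functools.reduce(operator.add, [])
    empty_failed = False
except TypeError:
    empty_failed = True

# The optional third argument is the initial value.
empty_total = functools.reduce(operator.add, [], timedelta(0))
```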

answered Mar 25 '20 at 11:51

Alex Hall

Am I the only one for whom this link is broken? From what the documentation says, it sounds like reduce is a good generic solution! – heinwol Mar 25 '20 at 12:19
@heinwol yes, the link works for me, even in private browsing – Alex Hall Mar 25 '20 at 12:20

score 2 · Answer 3 · answered Mar 25 '20 at 11:52

1. So that you can sum iterables of an arbitrary type. There's no protocol that dictates you could use type(val)() as a safe default-value constructor for a value, so Python doesn't assume one.
2. Know what you're summing, so you know what to pass as start=, and/or come up with a function that can, for the scope of your program, figure out what a suitable start= value is.
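
One way to read the second point is a small per-program lookup from the type being summed to its start value. The sketch below is hypothetical (START_FACTORIES and sum_of are made-up names, and the registered types are just examples). It assumes each factory, called with no arguments, returns that type's additive identity, which holds for the types listed but is not a general protocol.

```python
from collections import Counter
from datetime import timedelta

# Hypothetical registry: type -> zero-argument factory for its start value.
START_FACTORIES = {
    int: int,              # int() == 0
    float: float,          # float() == 0.0
    timedelta: timedelta,  # timedelta() is a zero duration
    Counter: Counter,      # Counter() is an empty counter
}


def sum_of(kind, items):
    """Sum items of a known type, looking up the matching start value."""
    return sum(items, START_FACTORIES[kind]())


letters = sum_of(Counter, [Counter("ab"), Counter("bc")])
no_time = sum_of(timedelta, [])  # the empty input still yields a timedelta
```

Because the type is passed explicitly rather than inferred from the first element, this also works for empty inputs.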
