Why can the 2nd parameter to sum() be an empty tuple? Shouldn't it be a number according to https://docs.python.org/3/library/functions.html#sum?

>>> tmp=((1,2), ('a','b'))
>>> sum(tmp, ())
(1, 2, 'a', 'b')
Haibin Liu

3 Answers

The 2nd parameter is the start value. This is not an index to start at, but a value to start the sum.

For example:

sum([1,2,3], 0) is the same as 0 + 1 + 2 + 3

sum([1,2,3], 6) is the same as 6 + 1 + 2 + 3

sum(((1,2), ('a','b')), ()) is the same as () + (1,2) + ('a','b')

Since start is 0 by default, if you don't specify a value for it you get

0 + (1,2) + ('a','b')

Which gives

TypeError: unsupported operand type(s) for +: 'int' and 'tuple'
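
To make the role of the start value concrete, here is a rough pure-Python model of what sum does. This is only a sketch: the name sum_sketch is made up for illustration, and the real built-in is implemented in C with extra checks and fast paths.

```python
def sum_sketch(iterable, start=0):
    """Rough model of the built-in sum(); sum_sketch is a hypothetical name."""
    result = start                 # the 2nd parameter seeds the running total
    for item in iterable:
        result = result + item     # plain +, so start must support + with the items
    return result

print(sum_sketch([1, 2, 3]))                 # 6
print(sum_sketch([1, 2, 3], 6))              # 12
print(sum_sketch(((1, 2), ('a', 'b')), ()))  # (1, 2, 'a', 'b')
```

With the default start of 0, the first step on a tuple of tuples would be 0 + (1, 2), which is where the TypeError comes from.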

wilkben

Short answer: because + (used by sum) can be redefined, you need some way of providing a type-appropriate starting value when the iterable doesn't contain numeric values.


The second argument is used as the starting point of the sum:

>>> sum([1,2,3])
6
>>> sum([1,2,3], 0)
6
>>> sum([1,2,3], 2)
8

Its default value is 0, which causes problems if the sequence you want to sum isn't numeric, as 0 + (1, 2) isn't defined. Instead, you need to provide a value that can be added to elements of your sequence; while 0 is the identity for numeric addition, sum doesn't know what an equivalent value for tuple concatenation is, and you must provide it directly.

>>> sum(((1,2), ('a', 'b')))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for +: 'int' and 'tuple'
>>> sum(((1,2), ('a', 'b')), ())
(1, 2, 'a', 'b')
>>> sum(((1,2), ('a', 'b')), (True,))
(True, 1, 2, 'a', 'b')
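
The same idea applies to any type that redefines +. As a small sketch, assume a hypothetical Vec class (not from the standard library) that only knows how to add itself to another Vec; the start value then has to be a Vec as well:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Vec:
    """Hypothetical 2D vector type, used only for illustration."""
    x: float
    y: float

    def __add__(self, other):
        if not isinstance(other, Vec):
            return NotImplemented   # lets Python raise TypeError for 0 + Vec
        return Vec(self.x + other.x, self.y + other.y)

points = [Vec(1, 2), Vec(3, 4)]

# Vec(0, 0) plays the role that 0 plays for numbers.
total = sum(points, Vec(0, 0))
print(total)  # Vec(x=4, y=6)
```

Calling sum(points) without the second argument fails with a TypeError here, for the same reason as with tuples: the default 0 cannot be added to a Vec.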
chepner

This answer explains the start parameter, but sum should never be used to flatten tuples or lists because doing so takes quadratic time. See the related question "could sum be faster on lists".

So:

Why can the 2nd parameter to sum() be an empty tuple? Shouldn't it be a number

Yes, it should be a number, and sum should only be applied to numeric elements if you want to keep it efficient. The start parameter is there to let you supply a starting value other than the default 0, for example 0.0.

Each time it encounters an item to sum, it doesn't perform in-place addition but something like (internally):

result = result + new_item

This results in O(n**2) complexity on lists or tuples, because the old contents need to be copied at each iteration. So don't do this (note that it's explicitly blocked for the str type).
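
To see where the quadratic cost comes from, this sketch counts how many elements get copied when tuples are concatenated one at a time with result = result + item. The function name copies_made is hypothetical, and the count is a simplified model of the work, not a measurement of CPython internals.

```python
def copies_made(num_tuples, tuple_len):
    """Count elements copied while concatenating tuples one by one."""
    copied = 0
    result = ()
    for _ in range(num_tuples):
        item = (0,) * tuple_len
        result = result + item   # allocates a brand-new tuple each time
        copied += len(result)    # every element of the new tuple was copied into it
    return copied

for n in (10, 100, 1000):
    print(n, copies_made(n, 2))
# 10 110
# 100 10100
# 1000 1001000
```

Multiplying the number of tuples by 10 multiplies the copied elements by roughly 100, which is the O(n**2) behaviour described above.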

Instead, use a flattening generator expression with two for clauses and build a tuple from it:

tmp=((1,2), ('a','b'))

result = tuple(x for st in tmp for x in st)

If your tuple of tuples has a lot of elements, you'll see the speed difference.
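
If you want to check this on your own machine, a quick timeit comparison along these lines should do. It is only a sketch: the test data and the two helper function names are made up for illustration, and the timings will vary from system to system.

```python
import timeit

# Hypothetical test data: 2,000 two-element tuples.
data = tuple((i, i + 1) for i in range(2000))

def flatten_with_sum(tuples):
    # Quadratic: each + copies everything accumulated so far.
    return sum(tuples, ())

def flatten_with_genexpr(tuples):
    # Linear: each element is visited once.
    return tuple(x for st in tuples for x in st)

# Both approaches must produce the same flattened tuple.
assert flatten_with_sum(data) == flatten_with_genexpr(data)

for func in (flatten_with_sum, flatten_with_genexpr):
    seconds = timeit.timeit(lambda: func(data), number=5)
    print(f"{func.__name__}: {seconds:.4f} s for 5 runs")
```

Increasing the number of tuples in data should widen the gap between the two.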

Jean-François Fabre