7

I am hoping someone can help me with a problem I'm stuck with. I have a large number of tuples (>500) that look like this:

(2,1,3,6)  
(1,2,5,5)  
(3,0,1,6)  
(10,1,1,4)  
(0,3,3,0)  

A snippet of my code reads:

sum1 = (A,B,C,D) # creates a tuple of sums of (A,B,C,D)  
mysum = map(sum, zip(A, B, C, D))
print(mysum)

I realize the above code is not correct. I am trying to find a way to add all the values A together, all the values of B together, all the values of C together, and all the values of D together using the zip function. For example, I would like to print something that looks like this:

Asum = 16  
Bsum = 7  
Csum = 13  
Dsum = 21  

Can anyone help please? Thanks very much for your time.

drbunsen

5 Answers

16
>>> zip((1,2,3),(10,20,30),(100,200,300))
[(1, 10, 100), (2, 20, 200), (3, 30, 300)]

>>> [sum(x) for x in zip((1,2,3),(10,20,30),(100,200,300))]
[111, 222, 333]

To do this with an arbitrarily large set of tuples:

>>> myTuples = [(1,2,3), (10,20,30), (100,200,300)]
>>> [sum(x) for x in zip(*myTuples)]
[111, 222, 333]

Side note: in Python 3, zip returns a lazy iterator rather than a list. You can always turn it into a list explicitly, like any other iterable: list(zip(...))
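A minimal sketch of that note, assuming Python 3 and the five tuples from the question (the names `rows`, `columns`, `as_list`, and `leftover` are only illustrative). It shows that the zip object builds nothing until it is consumed, and that it can be consumed only once:

```python
rows = [(2, 1, 3, 6), (1, 2, 5, 5), (3, 0, 1, 6), (10, 1, 1, 4), (0, 3, 3, 0)]

columns = zip(*rows)        # lazy iterator in Python 3; no tuples built yet
as_list = list(columns)     # forces evaluation: one tuple per column
leftover = list(columns)    # the iterator is exhausted after the first pass

sums = [sum(col) for col in as_list]

print(as_list[0])   # (2, 1, 3, 10, 0)
print(leftover)     # []
print(sums)         # [16, 7, 13, 21]
```

If the columns are needed more than once, keeping the `list(...)` result around avoids the empty second pass.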

(thanks to Seganku for catching mistake in examples in an edit which was thrice rejected by other editors)

Community
ninjagecko
  • `itertools.izip` is probably better suited for longer lists. – Felix Kling Apr 17 '11 at 12:57
  • When I zip((1,2,3),(10,20,30),(100,200,300)) I don't get [(1, 10, 100), (2, 20, 200), (3, 30, 300)]. I get a zip object instead. Any idea what is going wrong? – drbunsen Apr 17 '11 at 22:53
  • There is some difference in the way python2 handles zip compared to python3. I'm not sure why in python3 the above zip doesn't work? Any suggestions? – drbunsen Apr 17 '11 at 23:22
  • It works just fine; the zip object returns an iterable object (just like range does, or an initialized generator function). This is generally a good thing, because you don't have to compute every single value when you initialize it ("lazy is better"). You can always explicitly convert any such iterable to a list like so: list(zip((1,2),(3,4))), forcing the evaluation then-and-there. I have added a note about this in the question, thank you. – ninjagecko Apr 18 '11 at 17:42
  • @ninjagecko Could you please point me in the right direction to get the picture of the `*` in `*myTuples`? – Lerner Zhang Sep 23 '16 at 11:52
  • @lerneradams: By "get the picture" do you mean understand? That is the syntax for taking a list and 'applying' it as the arguments to a function. So `f(*[1,2,3])` means `f(1,2,3)`. There is a variant for keyword arguments: `f(**dict(x=1, y=2))` means `f(x=1, y=2)`. See http://stackoverflow.com/questions/3394835/args-and-kwargs and http://stackoverflow.com/questions/36901/what-does-double-star-and-star-do-for-python-parameters – ninjagecko Sep 23 '16 at 19:31
  • @ninjagecko Yeah, I meant comprehend. Thanks. I had read the official document which reads "with the * operator can be used to unzip a list" but I remained confused. Thanks very much. – Lerner Zhang Sep 24 '16 at 02:13
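To make the `*` syntax from these comments concrete, here is a small sketch. The function `describe` and the variable names are made up for the example; only `zip` and the unpacking syntax come from the discussion above.

```python
def describe(a, b, c):
    # Hypothetical helper that simply echoes its three arguments.
    return "a=%s b=%s c=%s" % (a, b, c)

args = [1, 2, 3]
# * spreads the list into positional arguments.
assert describe(*args) == describe(1, 2, 3)

kwargs = dict(a=1, b=2, c=3)
# ** spreads the dict into keyword arguments.
assert describe(**kwargs) == describe(a=1, b=2, c=3)

my_tuples = [(1, 2, 3), (10, 20, 30), (100, 200, 300)]
# zip(*my_tuples) is the same call as zip(my_tuples[0], my_tuples[1], my_tuples[2]).
same = list(zip(*my_tuples)) == list(zip((1, 2, 3), (10, 20, 30), (100, 200, 300)))
print(same)  # True
```

Because the list is unpacked at call time, `zip(*my_tuples)` works for any number of tuples without naming each one.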
4
map(sum, zip(a, b, c, d, e))

The first call, to zip, transposes the tuples: it groups all the first elements together, all the second elements together, and so on.

The second call, map, applies its first argument, sum, to each item of its second argument (those groups) and returns the results, which are the per-position sums.
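As a sketch of how this fits the question's data, assume a through e are bound to the five example tuples (the names `column_sums`, `name`, and `total` are illustrative). On Python 3, map is lazy, so the result is wrapped in list():

```python
a = (2, 1, 3, 6)
b = (1, 2, 5, 5)
c = (3, 0, 1, 6)
d = (10, 1, 1, 4)
e = (0, 3, 3, 0)

# zip groups the first elements, the second elements, and so on;
# map applies sum to each group. list() forces evaluation on Python 3.
column_sums = list(map(sum, zip(a, b, c, d, e)))

# Print in the format requested in the question.
for name, total in zip("ABCD", column_sums):
    print("%ssum = %d" % (name, total))
```

This prints Asum = 16, Bsum = 7, Csum = 13, and Dsum = 21, one per line.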

  • Thank you, I think this is what I need. I am still just a bit confused though. zip(a, b, c, d, e)... not sure what 'e' is? Isn't there only four numbers in each tuple (a,b,c,d)? When I run this in my script, I get a 'TypeError: zip argument #1 must support iteration.' Any ideas? I edited my original post to reflect what I did. – drbunsen Apr 17 '11 at 17:36
  • a, b, c, d, and e are the tuples you listed. Each is a tuple of four elements. –  Apr 17 '11 at 19:04
2

If your sets are all the same size and you are working in CPython, you should consider using numpy. You can do this in numpy like so:

In [5]: sets = ((2,1,3,6),
   ...:         (1,2,5,5),
   ...:         (3,0,1,6),
   ...:         (10,1,1,4),
   ...:         (0,3,3,0)  )*100

In [6]: import numpy as np

In [7]: A = np.array(sets)

In [8]: np.sum(A, axis=0)
Out[8]: array([1600,  700, 1300, 2100])

Numpy converts your values to an array and works with them efficiently using optimized, compiled routines.

To compare performance, I profiled under two sets of assumptions. In the first, I assume that your data is stored such that importing it into a numpy array is efficient, so I didn't include the time necessary to convert sets to an array. In the second, the conversion time is included. I compared the performance of np.sum in both cases to [sum(x) for x in zip(*sets)]. Here are the timeit results for each case:

Excluding numpy Array Conversion: 0.0735958760122
Including numpy Array Conversion: 17.1435046214
Plain old zip: 0.660146750495

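The conclusion is that numpy is faster only if your input data can be imported into a numpy array cheaply. When the conversion from Python tuples is included, the plain zip approach is much faster.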
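The comparison described above could be reproduced with something like the following sketch. It assumes numpy is installed; the function names and the repeat count `runs` are arbitrary choices for illustration, not the setup used for the figures quoted. Timings depend on the machine and library versions, so none are shown here.

```python
import timeit

import numpy as np

# Same data as above: the five 4-tuples repeated 100 times (500 rows).
sets = ((2, 1, 3, 6),
        (1, 2, 5, 5),
        (3, 0, 1, 6),
        (10, 1, 1, 4),
        (0, 3, 3, 0)) * 100

A = np.array(sets)  # converted once, outside the first timing


def numpy_preconverted():
    # Case 1: the array already exists, only the sum is timed.
    return np.sum(A, axis=0)


def numpy_with_conversion():
    # Case 2: the tuple-to-array conversion is part of the timed work.
    return np.sum(np.array(sets), axis=0)


def plain_zip():
    # Pure-Python baseline.
    return [sum(column) for column in zip(*sets)]


runs = 200  # arbitrary repeat count chosen for this sketch
for func in (numpy_preconverted, numpy_with_conversion, plain_zip):
    seconds = timeit.timeit(func, number=runs)
    print("%-24s %.4f s for %d runs" % (func.__name__, seconds, runs))
```

All three functions return the same column sums; only the amount of work inside the timed region differs.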

Carl F.
0

If you have all those tuples in a list, then you could use reduce():

>>> from functools import reduce  # a builtin on Python 2; must be imported on Python 3
>>> list(reduce(lambda x,y: (i+j for i,j in zip(x,y)), tuples, [0]*4))
[16, 7, 13, 21]
Felix Kling
0
tuples  = [(2,1,3,6), (1,2,5,5),  (3,0,1,6), (10,1,1,4), (0,3,3,0)]
s = [sum(tup) for tup in zip(*tuples)]
Asum, Bsum, Csum, Dsum = s
Vasil