We have two lists of tuples, possibly of different lengths, which look like this:

list1 = [(15339456, 140), (15340320, 412), (15341184, 364), (15342048, 488),
         (15342912, 272), (15343776, 350), (15344640, 301), (15345504, 159),
         (15346368, 224), (15347232, 241), (15348096, 223), (15348960, 175)]


list2 = [(15339456, 1516), (15341184, 2046), (15342048, 2400), (15342912, 8370),
         (15343776, 2112), (15344640, 1441), (15345504, 784),  (15346368, 1391)]

The first element of each tuple is the key, which is unique within each list. We cannot assume that every key exists in both lists: either list can contain keys that are missing from the other. We want to sum the second values of the tuples whose key appears in both lists; otherwise we keep the tuple unchanged.

Result:

[(15339456, 1656),
 (15340320, 412),
 (15341184, 2410),
 ...
]

Usually lists are summed element-wise using zip, like this:

lst = []  # collects the summed tuples
for tup1, tup2 in zip(list1, list2):
    sum_ = tup1[1] + tup2[1]
    lst.append((tup1[0], sum_))

That would work if both lists had the same length and every key existed in both lists at the same position, which is not the case.
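To make the mismatch concrete, here is a small sketch using shortened versions of the lists above (the names `a`, `b`, `pairs` and `mismatched` are only for illustration). zip pairs elements by position and stops at the end of the shorter list, so the keys get misaligned as soon as one list skips a key:

```python
# Shortened versions of list1 and list2; b is missing key 15340320.
a = [(15339456, 140), (15340320, 412), (15341184, 364)]
b = [(15339456, 1516), (15341184, 2046)]

# zip pairs by position and truncates to the shorter input.
pairs = list(zip(a, b))

# Collect the positions where the two keys differ.
mismatched = [(t1[0], t2[0]) for t1, t2 in pairs if t1[0] != t2[0]]
print(len(pairs), mismatched)
```

The second pair combines two different keys, and the last element of `a` is dropped entirely.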

Is there a nice way to build a condition into this for loop, or perhaps a more Pythonic solution? Two nested for loops with element-wise comparison do not seem quite satisfying.

mkrieger1
Jürgen K.
  • So you would first iterate over one list, then over the other, and check whether the key is in both: if so sum up, otherwise insert? Seems poor to me – Jürgen K. Jun 29 '21 at 12:38
  • @mkrieger1 Why iterate the first time to add to an empty list? However, to sum up the values of the second list, I would still need to check if the key is inside result, I don't know any operation which does this for me. – Jürgen K. Jun 29 '21 at 12:41
  • If each key is unique in each list, then each list can be converted to a dictionary first and then the solutions from https://stackoverflow.com/questions/11011756/is-there-any-pythonic-way-to-combine-two-dicts-adding-values-for-keys-that-appe can be applied. – mkrieger1 Jun 29 '21 at 12:50

1 Answer

An obvious solution is to create a result dictionary, then add all values from the first list and then all values from the second list:

from collections import defaultdict

result = defaultdict(int)
for key, value in list1:
    result[key] += value
for key, value in list2:
    result[key] += value

# convert dictionary-like to list of tuples if you want
result = list(result.items())
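As a quick check, here is the same approach run end to end on the data from the question. This is a sketch, not the only way to do it; the name `merged` and the final `sorted` call are additions for illustration. A dictionary keeps keys in insertion order (Python 3.7+), so keys that occur only in the second list would end up at the end; sorting restores key order if that matters:

```python
from collections import defaultdict

list1 = [(15339456, 140), (15340320, 412), (15341184, 364), (15342048, 488),
         (15342912, 272), (15343776, 350), (15344640, 301), (15345504, 159),
         (15346368, 224), (15347232, 241), (15348096, 223), (15348960, 175)]
list2 = [(15339456, 1516), (15341184, 2046), (15342048, 2400), (15342912, 8370),
         (15343776, 2112), (15344640, 1441), (15345504, 784), (15346368, 1391)]

result = defaultdict(int)
for key, value in list1:
    result[key] += value
for key, value in list2:
    result[key] += value

# Sort by key so the output order does not depend on insertion order.
merged = sorted(result.items())
print(merged[:3])
```

The first three entries match the expected result shown in the question.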

Using a dictionary for the result spares you a linear search for the key to which each value is added (such a search would make the overall complexity quadratic), and a defaultdict in particular spares you from doing

if key not in result:
    result[key] = 0

to initialize the result before adding the first value.
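If you would rather end up with a plain dict, one possible variant uses dict.get with a default of 0 instead of the explicit check. The two short lists below are made-up illustration data, not the lists from the question:

```python
# Made-up sample data for illustration only.
list1 = [(1, 10), (2, 20)]
list2 = [(2, 5), (3, 7)]

result = {}
for key, value in list1 + list2:
    # get() returns 0 for a key that has not been seen yet.
    result[key] = result.get(key, 0) + value

print(result)
```

Unlike a defaultdict, a plain dict does not silently create an entry when you later read a missing key.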

You can generalize this to any number of input lists by using itertools.chain:

from collections import defaultdict
from itertools import chain

input_lists = [list1, list2]

result = defaultdict(int)
for key, value in chain.from_iterable(input_lists):
    result[key] += value

Visually there is now only one for loop, but under the hood it does the same thing.
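If you need this in several places, the loop can be wrapped in a small helper. The function name `sum_by_key` and the sorted return value are assumptions made for this sketch; the sample data is made up:

```python
from collections import defaultdict
from itertools import chain


def sum_by_key(*pair_lists):
    """Sum the values of (key, value) pairs across any number of lists."""
    totals = defaultdict(int)
    for key, value in chain.from_iterable(pair_lists):
        totals[key] += value
    # Return a list of tuples sorted by key, like the inputs.
    return sorted(totals.items())


# Made-up sample data: three input lists with partially overlapping keys.
merged = sum_by_key([(1, 10), (2, 20)], [(2, 5), (3, 7)], [(1, 1)])
print(merged)
```

Called without arguments it returns an empty list, and with a single list it returns that list sorted by key.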

mkrieger1