Computing the sum of all unique values in a numpy array containing rows of dicts

Question

I have a large numpy array, with each row containing a dict of words, in a similar format to below:

data = [{'a': 1, 'c': 2}, {'ba': 3, 'a': 4}, ... ]

Could someone please point me in the right direction for computing the sum of the values for each unique key across the dicts in the rows of the numpy array? From the example above, I would hope to obtain something like this:

result = {'a': 5, 'c': 2, 'ba': 3, ...}

At the moment, the only way I can think of is to iterate through each row of the data and then through each key of the dict: if the key is new, add it to the result dict with its value; if the key is already in the result, add its value to the existing entry. This seems like an inefficient way to do it, though.

You might want to check out some of the ideas in here: https://stackoverflow.com/questions/16458340/python-equivalent-of-zip-for-dictionaries — Pablo Oliva, Nov 21 '17 at 19:41
Would it be possible to use [`Counter`s](https://docs.python.org/3/library/collections.html#collections.Counter) instead? Then this could be as simple as `sum(data, Counter())` — Patrick Haugh, Nov 21 '17 at 19:43
That looks like a list, not a numpy array. If it is indeed an array, why are you using an array to hold dicts? — Mad Physicist, Nov 21 '17 at 19:47

Reblochon Masque · Accepted Answer · 2017-11-21T19:54:26.800

score 3

You could use a Counter() and update it with each dictionary contained in data, in a loop:

from collections import Counter

data = [{'a': 1, 'c': 2}, {'ba': 3, 'a': 4}]
c = Counter()
for d in data:
    c.update(d)

output:

Counter({'a': 5, 'ba': 3, 'c': 2})
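A minimal sketch of why the loop above works: `Counter.update` adds to existing counts, whereas `dict.update` would overwrite them. The helper name `sum_dicts` is hypothetical and only used for illustration; converting to a plain `dict` at the end is an assumption about the output type wanted in the question.

```python
from collections import Counter

def sum_dicts(rows):
    """Sum values per key across an iterable of dicts (hypothetical helper)."""
    total = Counter()
    for row in rows:
        # Counter.update adds to existing counts instead of overwriting them
        total.update(row)
    # Convert to a plain dict to match the shape asked for in the question
    return dict(total)

data = [{'a': 1, 'c': 2}, {'ba': 3, 'a': 4}]
result = sum_dicts(data)
print(result)
```

Because the helper only iterates over its argument, it should work the same way for a list or for a numpy object array of dicts.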

Alternative one-liner (as proposed by @AntonVBR in the comments):

sum((Counter(dict(x)) for x in data), Counter())
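One caveat worth knowing, sketched below with made-up data that includes a negative value: `Counter` addition with `+` (which `sum` uses) keeps only keys whose total is positive, while `update` keeps every key. For strictly positive values, as in the question, both forms should give the same totals.

```python
from collections import Counter

# Hypothetical data with a negative value, chosen to show the difference
data = [{'a': 1, 'c': 2}, {'a': -1, 'ba': 3}]

# Loop with update: 'a' stays in the result with a total of 0
looped = Counter()
for d in data:
    looped.update(d)

# sum with '+': totals that are zero or negative are dropped, so 'a' disappears
summed = sum((Counter(d) for d in data), Counter())

print(dict(looped))
print(dict(summed))
```

The `Counter()` passed as the second argument to `sum` is needed because the default start value is `0`, and `0 + Counter(...)` raises a `TypeError`.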

edited Nov 21 '17 at 19:54

answered Nov 21 '17 at 19:43


Can be written in one line like this: `sum((Counter(dict(x)) for x in data),Counter())` according to https://stackoverflow.com/questions/11290092/python-elegantly-merge-dictionaries-with-sum-of-values – Anton vBR Nov 21 '17 at 19:52

score 2 · Answer 2 · answered Nov 21 '17 at 19:43

A pure Python solution using for-loops:

data = [{'a': 1, 'c': 2}, {'ba': 3, 'a': 4}]
result = {}
for d in data:
    for k, v in d.items():
        if k in result:
            result[k] += v
        else:
            result[k] = v

output:

{'c': 2, 'a': 5, 'ba': 3}
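As a possible shortening of the loop above (a sketch, not part of the original answer), `dict.get` with a default of `0` removes the need for the `if`/`else` branch:

```python
data = [{'a': 1, 'c': 2}, {'ba': 3, 'a': 4}]
result = {}
for d in data:
    for k, v in d.items():
        # dict.get returns 0 for a key not seen yet, so no if/else is needed
        result[k] = result.get(k, 0) + v
print(result)
```

Unlike the `Counter` versions, this keeps the result a plain `dict` throughout.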
