0

I googled around for merging dictionaries, but the results I looked at all assumed that the value gets replaced. That is, if you have a dict like {'config_prop': 2} and another dict like {'config_prop': 7}, the end result of the merge is {'config_prop': 7}. What I want is {'config_prop': 9}.

My naive approach is the following, which works but is quite slow.

split_output = [{'some_prop': 1}, {'some_prop': 2, 'other_prop': 19}]
combined_output = {}
for d in split_output:
    if combined_output == {}:
        combined_output = d.copy()
    else:
        for key, value in d.items():
            if key in combined_output:
                combined_output[key] = combined_output[key] + value  # add to existing val
            else:
                combined_output[key] = value

I'd love to hear suggestions on a better way to do this. Thanks!

Update: I tried this but it is considerably slower than my original code:

from collections import Counter

final_count = Counter()
for d in split_output:
    final_count += Counter(d)
final_output = dict(final_count)
Aaron

3 Answers

2

I quickly profiled a few different approaches to your question:

combined_output = [{'some_prop': 1}, {'some_prop': 2, 'other_prop': 19}]

# Set the key, or add the value to the running total, in a new dictionary:

def dict_sum(dicts):
    output = {}
    for d in dicts:
        for key, value in d.items():  # .items() on Python 3 (.iteritems() is Python 2 only)
            if key in output:
                output[key] += value
            else:
                output[key] = value
    return output
    
# Implement a reducer function using functools:

from functools import reduce

def reducer(accumulator, element):
    for key, value in element.items():
        accumulator[key] = accumulator.get(key, 0) + value
    return accumulator
    
# Use a Counter:

from collections import Counter

def sum_dicts_values_by_key(dicts):
    # Counter + Counter adds values key by key; start the sum from an empty Counter
    return dict(sum([Counter(x) for x in dicts], Counter()))

# Using dictionary comprehension:

def sum_dict_comprehension(dicts):
    # Caveat: keys in dicts[0] must exist in every dict; other keys are overwritten, not summed.
    return {k: sum([i[k] for i in dicts])
            if k in dicts[0] else i[k] for i in dicts for k in i}

Using timeit to run quick benchmarks for comparison:

res_1 = dict_sum(combined_output)
1000000 loops, best of 3: 457 ns per loop

res_2 = reduce(reducer, combined_output, {})
1000000 loops, best of 3: 1.35 µs per loop

res_3 = sum_dicts_values_by_key(combined_output)
100000 loops, best of 3: 12.8 µs per loop

res_4 = sum_dict_comprehension(combined_output)
1000000 loops, best of 3: 1.53 µs per loop
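
The timings above come from a two-dict sample with at most two keys, so they may not carry over to larger inputs. A minimal sketch for repeating the comparison at other sizes with the standard `timeit` module follows. The helper names (`make_dicts`, `sum_with_loop`, `sum_with_counter`, `time_approaches`) and the generated data are assumptions made for illustration; they are not part of the original answer, and no timings are claimed here.

```python
import random
import timeit
from collections import Counter


def make_dicts(n_dicts, n_keys, seed=0):
    """Build hypothetical test data: n_dicts dicts sharing the same n_keys keys."""
    rng = random.Random(seed)
    return [
        {f"prop_{i}": rng.randint(1, 100) for i in range(n_keys)}
        for _ in range(n_dicts)
    ]


def sum_with_loop(dicts):
    """Plain loop: add each value onto a running total per key."""
    output = {}
    for d in dicts:
        for key, value in d.items():
            output[key] = output.get(key, 0) + value
    return output


def sum_with_counter(dicts):
    """The Counter approach from the question's update."""
    final_count = Counter()
    for d in dicts:
        final_count += Counter(d)
    return dict(final_count)


def time_approaches(dicts, repeat=3):
    """Return the best wall-clock time in seconds for each approach."""
    timings = {}
    for func in (sum_with_loop, sum_with_counter):
        runs = timeit.repeat(lambda: func(dicts), number=1, repeat=repeat)
        timings[func.__name__] = min(runs)
    return timings


# Small run for illustration; raise n_keys (e.g. to 400_000) to match your data.
data = make_dicts(n_dicts=2, n_keys=1_000)
for name, seconds in time_approaches(data).items():
    print(f"{name}: {seconds:.6f} s")
```

The generated values start at 1 because `Counter` addition drops keys whose total is zero or negative, which would make the two results differ.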
Wes Doyle
  • added fourth method to profile dictionary comprehension – Wes Doyle Oct 02 '18 at 02:38
  • Thanks, Wes! Great answer. I wonder if dict size affects this at all. In my case I have two dicts each with a little over 400,000 key-value pairs. – Aaron Oct 02 '18 at 14:39
  • There appears to be a typo in the `dict_sum` method, it will `return` after one dictionary. – joocer Oct 17 '21 at 16:08
1

You can use the following:

from collections import Counter

A = Counter({'config_prop': 2})
B = Counter({'config_prop': 7})
A + B  # Counter({'config_prop': 9})
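
For more than two dicts, one possible variation is to accumulate into a single `Counter` with `Counter.update`, which adds the incoming values to the existing counts in place rather than building a new `Counter` at every step. This is a sketch using the `split_output` sample from the question; whether it is faster on large inputs has not been measured here. Unlike `+`, `update` keeps keys whose total is zero or negative.

```python
from collections import Counter

split_output = [{'some_prop': 1}, {'some_prop': 2, 'other_prop': 19}]

total = Counter()
for d in split_output:
    # update() adds the values of d to the counts already stored in total
    total.update(d)

final_output = dict(total)
print(final_output)  # {'some_prop': 3, 'other_prop': 19}
```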
David Zemens
Irfanuddin
  • Hmmm... interesting idea but it is much slower than before. I updated the question with the new code I tried. – Aaron Oct 02 '18 at 00:52
  • I'm not really sure why. But [this](https://stackoverflow.com/questions/41594940/why-is-collections-counter-so-slow) claims that Counter is the fastest and explains what may slow it down. Consider reading it. – Irfanuddin Oct 02 '18 at 00:58
0

You could use a dictionary comprehension:

combo = [{'some_prop': 1}, {'some_prop': 2, 'other_prop': 19}]

d = {k: sum([i[k] for i in combo]) if k in combo[0] else i[k] for i in combo for k in i}
{'some_prop': 3, 'other_prop': 19}
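
The comprehension above works for this sample, but it indexes every dict with each key of `combo[0]`, so it raises `KeyError` if a later dict lacks one of those keys, and a key missing from `combo[0]` takes the value from the last dict that has it instead of a sum. A sketch of a more general variant, assuming missing keys should count as 0 (the third dict and the name `all_keys` are added here for illustration):

```python
combo = [{'some_prop': 1}, {'some_prop': 2, 'other_prop': 19}, {'other_prop': 1}]

# Collect every key that appears in any dict, then sum with a default of 0
all_keys = set().union(*combo)
d = {k: sum(i.get(k, 0) for i in combo) for k in all_keys}

print(d == {'some_prop': 3, 'other_prop': 20})  # True
```

This visits every dict once per key, so it trades speed for generality compared with a single-pass loop.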
vash_the_stampede