Fastest way of merging (while preserving values) multiple dictionaries in Python?

Question

I googled around for merging dictionaries, but the results I looked at all assumed the value would be replaced. That is, if you have a dict like {'config_prop': 2} and another dict like {'config_prop': 7}, the end result of the merge is {'config_prop': 7}. What I want is {'config_prop': 9}.

My naive approach is the following, which works but is quite slow.

split_output = [{'some_prop': 1}, {'some_prop': 2, 'other_prop': 19}]
combined_output = {}
for d in split_output:
    if combined_output == {}:
        combined_output = d.copy()
    else:
        for key, value in d.items():
            if key in combined_output:
                combined_output[key] = combined_output[key] + value  # add to existing val
            else:
                combined_output[key] = value

I'd love to hear suggestions on a better way to do this. Thanks!

Update: I tried this but it is considerably slower than my original code:

from collections import Counter

final_count = Counter()
for d in split_output:
    final_count += Counter(d)
final_output = dict(final_count)

Wes Doyle · Accepted Answer · 2021-10-19T13:12:31.607

I quickly profiled a few different approaches to your question:

combined_output = [{'some_prop': 1}, {'some_prop': 2, 'other_prop': 19}]

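# Set the key, or add the value to the existing entry, in a new dictionary object:

def dict_sum(dicts):
    output = {}
    for d in dicts:
        for key, value in d.items():  # .items() for Python 3 (.iteritems() is Python 2 only)
            if key in output:
                output[key] += value
            else:
                output[key] = value
    return output  # return only after every dictionary has been processed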
    
# Implement a reducer function using functools:

from functools import reduce

def reducer(accumulator, element):
    for key, value in element.items():
        accumulator[key] = accumulator.get(key, 0) + value
    return accumulator
    
# Use a Counter:

from collections import Counter

def sum_dicts_values_by_key(dicts):
    # Counter addition keeps only positive totals; zero or negative sums are dropped.
    return dict(sum([Counter(x) for x in dicts], Counter()))

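# Using a dictionary comprehension:
# (assumes every key in dicts[0] is present in all dicts; a key missing from
# dicts[0] keeps only the value from the last dict that contains it)

def sum_dict_comprehension(dicts):
    return {
        k: sum([i[k] for i in dicts]) if k in dicts[0] else i[k]
        for i in dicts
        for k in i
    }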

Using timeit to run quick benchmarks for comparison:

res_1 = dict_sum(combined_output)
1000000 loops, best of 3: 457 ns per loop

res_2 = reduce(reducer, combined_output, {})
1000000 loops, best of 3: 1.35 µs per loop

res_3 = sum_dicts_values_by_key(combined_output)
100000 loops, best of 3: 12.8 µs per loop

res_4 = sum_dict_comprehension(combined_output)
1000000 loops, best of 3: 1.53 µs per loop
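
The numbers above come from a two-element list of tiny dicts, so they mostly measure call overhead. To see how the approaches compare on larger inputs, the same functions can be timed with the standard-library timeit module. This is a minimal sketch, assuming Python 3; the helper make_dicts, the sizes, and the repeat counts are illustrative choices and not part of the original benchmark:

```python
import random
import timeit
from collections import Counter
from functools import reduce


def dict_sum(dicts):
    output = {}
    for d in dicts:
        for key, value in d.items():
            if key in output:
                output[key] += value
            else:
                output[key] = value
    return output


def reducer(accumulator, element):
    for key, value in element.items():
        accumulator[key] = accumulator.get(key, 0) + value
    return accumulator


def sum_dicts_values_by_key(dicts):
    return dict(sum([Counter(x) for x in dicts], Counter()))


def make_dicts(n_dicts, n_keys, seed=0):
    """Hypothetical helper: build n_dicts dicts that share the same n_keys keys."""
    rng = random.Random(seed)
    return [
        {f"key_{i}": rng.randint(1, 100) for i in range(n_keys)}
        for _ in range(n_dicts)
    ]


if __name__ == "__main__":
    data = make_dicts(n_dicts=2, n_keys=50_000)
    candidates = {
        "dict_sum": lambda: dict_sum(data),
        "reduce": lambda: reduce(reducer, data, {}),
        "Counter sum": lambda: sum_dicts_values_by_key(data),
    }
    for name, func in candidates.items():
        # Best of 3 repeats, 3 calls each, reported per call.
        best = min(timeit.repeat(func, number=3, repeat=3)) / 3
        print(f"{name}: {best * 1000:.1f} ms per call")
```

Timings depend on the machine and Python version, so treat the printed values as relative rather than absolute.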

Thanks, Wes! Great answer. I wonder if dict size affects this at all. In my case I have two dicts each with a little over 400,000 key-value pairs. — Aaron, Oct 02 '18 at 14:39
There appears to be a typo in the `dict_sum` method, it will `return` after one dictionary. — joocer, Oct 17 '21 at 16:08

score 1 · Answer 2 · edited Oct 02 '18 at 00:41

You can use the following:

from collections import Counter

A = Counter({'config_prop': 2})
B = Counter({'config_prop': 7})
A + B
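
For more than two dictionaries, one option is to accumulate into a single Counter with update(), which for a mapping argument adds to the existing counts instead of replacing them. That avoids building a new Counter for every addition. A minimal sketch (the function name merge_with_counter is made up for illustration); whether it beats a plain loop on your data is something to measure rather than assume:

```python
from collections import Counter


def merge_with_counter(dicts):
    """Sum values key by key using one Counter that is updated in place."""
    total = Counter()
    for d in dicts:
        total.update(d)  # for a mapping, update() adds to existing counts
    return dict(total)


split_output = [{'some_prop': 1}, {'some_prop': 2, 'other_prop': 19}]
print(merge_with_counter(split_output))
# {'some_prop': 3, 'other_prop': 19}
```

Unlike A + B, update() keeps keys whose total is zero or negative.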

edited Oct 02 '18 at 00:41

David Zemens


answered Oct 02 '18 at 00:37

Irfanuddin


Hmmm... interesting idea but it is much slower than before. I updated the question with the new code I tried. – Aaron Oct 02 '18 at 00:52
I'm not really sure why. But [this](https://stackoverflow.com/questions/41594940/why-is-collections-counter-so-slow) claims that Counter is the fastest and explains what may slow it down. Consider reading it. – Irfanuddin Oct 02 '18 at 00:58

vash_the_stampede · Answer 3 · 2018-10-02T02:14:19.587

You could use a dictionary comprehension:

combo = [{'some_prop': 1}, {'some_prop': 2, 'other_prop': 19}]

d = {k: sum([i[k] for i in combo]) if k in combo[0] else i[k] for i in combo for k in i}

{'some_prop': 3, 'other_prop': 19}
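
The comprehension above relies on every key of combo[0] being present in all the other dicts, and a key that is missing from combo[0] keeps only the last value seen for it. A variant that sidesteps both cases, as a sketch (sum_by_key is an illustrative name), collects the union of keys first and uses dict.get with a default of 0:

```python
def sum_by_key(dicts):
    """Sum values per key; a dict that lacks a key contributes 0 for it."""
    all_keys = set().union(*dicts)  # union of the keys of every dict
    return {k: sum(d.get(k, 0) for d in dicts) for k in all_keys}


combo = [{'some_prop': 1}, {'other_prop': 19}, {'some_prop': 2, 'other_prop': 1}]
result = sum_by_key(combo)
print(result == {'some_prop': 3, 'other_prop': 20})  # True
```

This still does one lookup per key per dict, so for many large dicts a single pass with a plain loop is likely to be cheaper.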

edited Oct 02 '18 at 02:14

answered Oct 02 '18 at 01:01

