How to merge a list of dictionaries in Python in the shortest and fastest way possible?

Question

I want to merge a list of dictionaries in Python. The number of dictionaries in the list is not fixed, and the dictionaries may share some keys and differ on others. The dictionaries themselves do not contain nested dictionaries. Values that share a key should be collected in a list.

My code is:

list_of_dict = [{'a': 1, 'b': 2, 'c': 3}, {'a': 3, 'b': 5}, {'k': 5, 'j': 5}, {'a': 3, 'k': 5, 'd': 4}, {'a': 3} ...... ]
output = {}

for i in list_of_dict:
    for k,v in i.items():
        if k in output:
            output[k].append(v)
        else:
            output[k] = [v]

Is there a shorter and faster way of implementing this?

I am looking for the fastest way to do this because the list of dictionaries is very large, and there are many rows of such data.

What is `list_of_dict.items()`? `list` doesn't have an `items()` method. — TheFungusAmongUs, Jan 19 '22 at 04:29
This question is being discussed on [meta](https://meta.stackoverflow.com/q/415544/16775594). — Sylvester Kruin, Jan 25 '22 at 16:15

score 4 · Answer 1 · answered Jan 19 '22 at 04:31

One way using collections.defaultdict:

from collections import defaultdict

res = defaultdict(list)

for d in list_of_dict:
    for k, v in d.items():
        res[k].append(v)

Output:

defaultdict(list,
            {'a': [1, 3, 3, 3],
             'b': [2, 5],
             'c': [3],
             'k': [5, 5],
             'j': [5],
             'd': [4]})
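If a plain `dict` is wanted at the end (for printing or serialising, say), the `defaultdict` can be converted with `dict(res)`. A self-contained sketch using the question's sample data; `merge_to_lists` is a hypothetical helper name, not something from the question:

```python
from collections import defaultdict

list_of_dict = [{'a': 1, 'b': 2, 'c': 3}, {'a': 3, 'b': 5}, {'k': 5, 'j': 5}, {'a': 3, 'k': 5, 'd': 4}, {'a': 3}]

def merge_to_lists(dicts):
    """Collect the values of every key into a list (hypothetical helper)."""
    res = defaultdict(list)
    for d in dicts:
        for k, v in d.items():
            res[k].append(v)  # a missing key starts as an empty list
    return dict(res)  # convert to a plain dict so later lookups of missing keys raise KeyError

merged = merge_to_lists(list_of_dict)
print(merged)
# {'a': [1, 3, 3, 3], 'b': [2, 5], 'c': [3], 'k': [5, 5], 'j': [5], 'd': [4]}
```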

score 1 · Answer 2 · answered Jan 19 '22 at 04:30

items() is a dictionary method, but list_of_dict is a list. You need a nested loop so you can loop over the dictionaries and then loop over the items of each dictionary.

output = {}
for d in list_of_dict:
    for key, value in d.items():
        output.setdefault(key, []).append(value)
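For readers unfamiliar with `dict.setdefault`, here is a small sketch of what the call does on its own; the variable names `first` and `second` are only for illustration:

```python
output = {}

# The key is missing: setdefault stores the default list and returns it.
first = output.setdefault('a', [])
first.append(1)

# The key exists: setdefault ignores the default and returns the stored list.
second = output.setdefault('a', [])
second.append(3)

print(output)           # {'a': [1, 3]}
print(first is second)  # True
```

Because the returned list is the one stored in the dictionary, appending to it updates `output` in place, which is why the one-line form above works.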

Nishant Nawarkhede · Answer 3 · 2022-01-19T05:40:04.840

1

Another, shorter version:

list_of_dict = [{'a': 1, 'b': 2, 'c': 3}, {'a': 3, 'b': 5}, {'k': 5, 'j': 5}, {'a': 3, 'k': 5, 'd': 4}, {'a': 3}]

output = {
    k: [d[k] for d in list_of_dict if k in d]
    for k in set().union(*list_of_dict)
}
print(output)
# {'d': [4], 'k': [5, 5], 'a': [1, 3, 3, 3], 'j': [5], 'c': [3], 'b': [2, 5]}

edited Jan 19 '22 at 05:40

answered Jan 19 '22 at 04:46

Nishant Nawarkhede

I was thinking along this line too, but do you think this would be faster than the straightforward solution of nested loop? (also `d.get(k)` can be replaced with `k in d`) – justhalf Jan 19 '22 at 04:49
I just tested it on 1000 rows of 100 entries each (average over 100 trials), seems like the nested loop version is faster by about 10% if using `d.get(k)`, but using `k in d` this approach is faster by 45%. – justhalf Jan 19 '22 at 05:05
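The comparison described in the comments above can be tried with a sketch along these lines. The data shape (1000 dictionaries of 100 entries each) follows the comment; the key pool of 200 names, the random values, and the function names are assumptions made for illustration, and timings will vary by machine and Python version.

```python
import random
import timeit
from collections import defaultdict

# Assumed test data: 1000 dicts, each with 100 keys drawn from a pool of 200.
random.seed(0)
keys = [f"k{i}" for i in range(200)]
list_of_dict = [
    {k: random.randint(0, 9) for k in random.sample(keys, 100)}
    for _ in range(1000)
]

def nested_loop(dicts):
    """Nested-loop approach: visit every (key, value) pair once."""
    res = defaultdict(list)
    for d in dicts:
        for k, v in d.items():
            res[k].append(v)
    return dict(res)

def comprehension(dicts):
    """Comprehension approach: scan all dicts once per distinct key."""
    return {k: [d[k] for d in dicts if k in d] for k in set().union(*dicts)}

# Both approaches should agree before their timings are compared.
assert nested_loop(list_of_dict) == comprehension(list_of_dict)

runs = 5
for fn in (nested_loop, comprehension):
    seconds = timeit.timeit(lambda: fn(list_of_dict), number=runs)
    print(f"{fn.__name__}: {seconds / runs * 1e3:.2f} ms per call")
```

The comprehension scans the whole list once per distinct key, so its cost is likely to grow with the number of distinct keys, while the nested loop touches each item once. It is worth measuring on data shaped like your own.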

score 0 · Answer 4 · answered Jul 20 '22 at 14:12

In Python 3.9+, you can use the in-place merge operator `|=` for this.

def merge_dicts(dicts):
    result = dict()
    for _dict in dicts:
        result |= _dict
    return result
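As a quick check of what `|=` produces, here is the function applied to the question's sample data. For a repeated key the later value replaces the earlier one, so the result holds single values rather than lists:

```python
list_of_dict = [{'a': 1, 'b': 2, 'c': 3}, {'a': 3, 'b': 5}, {'k': 5, 'j': 5}, {'a': 3, 'k': 5, 'd': 4}, {'a': 3}]

def merge_dicts(dicts):
    result = dict()
    for _dict in dicts:
        result |= _dict  # requires Python 3.9+; later values overwrite earlier ones
    return result

overwritten = merge_dicts(list_of_dict)
print(overwritten)
# {'a': 3, 'b': 5, 'c': 3, 'k': 5, 'j': 5, 'd': 4}
```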

Dan Leonard

does not work for the question asked... – TheoryX Aug 25 '23 at 16:11

everestial007 · Accepted Answer · 2022-01-21T05:35:49.557

One of the shortest ways would be to

prepare a list/set of all the keys from all the dictionaries,
and look up each of those keys in every dictionary in the list.

list_of_dict = [{'a': 1, 'b': 2, 'c': 3}, {'a': 3, 'b': 5}, {'k': 5, 'j': 5}, {'a': 3, 'k': 5, 'd': 4}, {'a': 3}]

# prepare a list/set of all the keys from all the dictionaries

# method 1: use sum (gives a list with duplicates)
all_keys = sum([list(x) for x in list_of_dict], [])

# method 2: use itertools (gives the same list with duplicates)
import itertools
all_keys = list(itertools.chain.from_iterable(list_of_dict))

print(all_keys)
# ['a', 'b', 'c', 'a', 'b', 'k', 'j', 'a', 'k', 'd', 'a']

# convert the list to set to remove duplicates
all_keys = set(all_keys)

# method 3: use union of the set (gives the deduplicated set directly)
all_keys = set().union(*list_of_dict)

print(all_keys)
# {'a', 'k', 'c', 'd', 'b', 'j'}  (set order may vary)

# now merge the dictionary
merged = {k: [d.get(k) for d in list_of_dict if k in d] for k in all_keys}
print(merged)
# {'a': [1, 3, 3, 3], 'k': [5, 5], 'c': [3], 'd': [4], 'b': [2, 5], 'j': [5]}

In short:

all_keys = set().union(*list_of_dict)
merged = {k: [d.get(k) for d in list_of_dict if k in d] for k in all_keys}

print(merged)
# {'a': [1, 3, 3, 3], 'k': [5, 5], 'c': [3], 'd': [4], 'b': [2, 5], 'j': [5]}
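Since a set has no defined order, the key order of `merged` can differ between runs. If first-seen key order matters, one possible variation (a sketch, not part of the approach above) is to deduplicate the keys with `dict.fromkeys`, which keeps insertion order on Python 3.7+:

```python
from itertools import chain

list_of_dict = [{'a': 1, 'b': 2, 'c': 3}, {'a': 3, 'b': 5}, {'k': 5, 'j': 5}, {'a': 3, 'k': 5, 'd': 4}, {'a': 3}]

# dict.fromkeys drops duplicate keys while keeping the order in which they first appear.
all_keys = dict.fromkeys(chain.from_iterable(list_of_dict))

# k is known to be in d because of the filter, so d[k] is enough here.
merged = {k: [d[k] for d in list_of_dict if k in d] for k in all_keys}

print(merged)
# {'a': [1, 3, 3, 3], 'b': [2, 5], 'c': [3], 'k': [5, 5], 'j': [5], 'd': [4]}
```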

To get keys, would `set().union(*list_of_dicts))` be better? — justhalf, Jan 20 '22 at 06:02
For future readers, this answer had a deleted comment that was discussed [on this meta post](https://meta.stackoverflow.com/questions/415544/whats-happening-on-this-answer). — justhalf, Jan 27 '22 at 16:09
