1

I'd like to merge a list of dictionaries with lists as values. Given

arr[0] = {'number':[1,2,3,4], 'alphabet':['a','b','c']}
arr[1] = {'number':[3,4], 'alphabet':['d','e']}
arr[2] = {'number':[6,7], 'alphabet':['e','f']}

the result I want would be

merge_arr = {'number':[1,2,3,4,3,4,6,7,], 'alphabet':['a','b','c','d','e','e','f']}

Could you recommend any compact code?

pault
jiwon Seo

4 Answers

2

If you know these are the only keys in the dicts, you can hard-code them. If your real data isn't that simple, please show a more complicated example.

from pprint import pprint


arr = [
    {
        'number':[1,2,3,4], 
        'alphabet':['a','b','c']
    },
    {
        'number':[3,4], 
        'alphabet':['d','e']
    },
    {
        'number':[6,7], 
        'alphabet':['e','f']
    }
]

merged_arr = {
    'number': [],
    'alphabet': []
}

for d in arr:
    merged_arr['number'].extend(d['number'])
    merged_arr['alphabet'].extend(d['alphabet'])

pprint(merged_arr)

Output:

{'alphabet': ['a', 'b', 'c', 'd', 'e', 'e', 'f'],
 'number': [1, 2, 3, 4, 3, 4, 6, 7]}
Diptangsu Goswami
1
arr = [{'number':[1,2,3,4], 'alphabet':['a','b','c']},{'number':[3,4], 'alphabet':['d','e']},{'number':[6,7], 'alphabet':['e','f']}]

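# Use names that do not shadow the builtin dict.
merged = {}
for k in arr[0].keys():
    # sum(..., []) concatenates the lists stored under key k
    merged[k] = sum([d[k] for d in arr], [])
print(merged)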

output:

{'number': [1, 2, 3, 4, 3, 4, 6, 7], 'alphabet': ['a', 'b', 'c', 'd', 'e', 'e', 'f']}
ncica
  • I was looking for compact code or a library function, but I think this is the best in Python. – jiwon Seo Aug 04 '19 at 04:57
  • `oo = {} for k in arr[0].keys(): oo[k] = [] for m in arr: oo[k] += [m[k]]` is my solution, which references your code. Thank you! – jiwon Seo Aug 04 '19 at 04:57
  • Using `sum` to concatenate lists is very inefficient. It's fine for small lists, but it results in quadratic complexity for something that can be done in linear time. See [why sum on lists is (sometimes) faster than itertools.chain?](https://stackoverflow.com/questions/41772054/why-sum-on-lists-is-sometimes-faster-than-itertools-chain) – pault Aug 08 '19 at 17:46
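
As a rough sketch of the linear-time alternative the comment above alludes to, the lists can be chained with `itertools.chain.from_iterable` instead of summed. This is not taken from any of the answers; it assumes Python 3.7+ (for the printed key order) and that every dict has the same keys as `arr[0]`.

```python
from itertools import chain

arr = [
    {'number': [1, 2, 3, 4], 'alphabet': ['a', 'b', 'c']},
    {'number': [3, 4], 'alphabet': ['d', 'e']},
    {'number': [6, 7], 'alphabet': ['e', 'f']},
]

# Assumption: every dict in arr has the same keys as arr[0].
# chain.from_iterable walks each list once, so the work is linear
# in the total number of items.
merged = {k: list(chain.from_iterable(d[k] for d in arr)) for k in arr[0]}

print(merged)
# {'number': [1, 2, 3, 4, 3, 4, 6, 7], 'alphabet': ['a', 'b', 'c', 'd', 'e', 'e', 'f']}
```

This stays a one-line merge, which is close to the compactness the question asks for, while avoiding the repeated list copying that `sum` performs.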
0

EDIT: As noted by @pault, the solution below has quadratic complexity and is therefore not recommended for large lists. There are more efficient ways to do this.

However if you’re looking for compactness and relative simplicity, keep reading.


If you want a more functional form, this two-liner will do:

from functools import reduce  # reduce is not a builtin in Python 3

arr = [{'number':[1,2,3,4], 'alphabet':['a','b','c']},{'number':[3,4], 'alphabet':['d','e']},{'number':[6,7], 'alphabet':['e','f']}]

keys = ['number', 'alphabet']
# d is used as the loop variable so the builtin dict is not shadowed
merge_arr = {key: reduce(list.__add__, [d[key] for d in arr]) for key in keys}

print(merge_arr)

Outputs:

{'number': [1, 2, 3, 4, 3, 4, 6, 7], 'alphabet': ['a', 'b', 'c', 'd', 'e', 'e', 'f']}

This won't merge recursively.

If you want it to work with arbitrary keys that may not be present in every dict, use:

# Collect every key that appears in any of the dicts.
keys = {k for d in arr for k in d}
merge_arr = {key: reduce(list.__add__, [d.get(key, []) for d in arr]) for key in keys}
hugo
  • Same [comment as on the accepted answer](https://stackoverflow.com/questions/57340332/how-do-you-combine-lists-of-multiple-dictionaries-in-python/57340477#comment101315869_57340477) - this is a really inefficient way to concatenate lists. See [why sum on lists is (sometimes) faster than itertools.chain?](https://stackoverflow.com/questions/41772054/why-sum-on-lists-is-sometimes-faster-than-itertools-chain) – pault Aug 08 '19 at 17:48
  • It fits OA's use case, and to be fair, they asked for « a compact code ». But you’re right, my solution is suboptimal and I can’t recommend it. You should post your solution as an answer. As for me, maybe I should delete mine? – hugo Aug 08 '19 at 19:29
  • I mean it works and the performance hit only really matters when the lists get big. Also, it's a lot easier to understand than `dict((k, list(chain.from_iterable(map(itemgetter(1), v)))) for k, v in groupby(chain.from_iterable(zip(*(sorted(d.items()) for d in arr))), itemgetter(0)))` (which I haven't really thought through to see if it's more efficient). – pault Aug 08 '19 at 19:31
0

Here is code that uses defaultdict to more easily collect the items. You could leave the result as a defaultdict but this version converts that to a regular dictionary. This code will work with any keys, and the keys in the various dictionaries can differ, as long as the values are lists. Therefore this answer is more general than the other answers given so far.

from collections import defaultdict

arr = [{'number':[1,2,3,4], 'alphabet':['a','b','c']},
       {'number':[3,4], 'alphabet':['d','e']},
       {'number':[6,7], 'alphabet':['e','f']},
]

merge_arr_default = defaultdict(list)
for adict in arr:
    for key, value in adict.items():
        merge_arr_default[key].extend(value)
merge_arr = dict(merge_arr_default)

print(merge_arr)

The printed result is

{'number': [1, 2, 3, 4, 3, 4, 6, 7], 'alphabet': ['a', 'b', 'c', 'd', 'e', 'e', 'f']}
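
To illustrate the claim that the keys may differ between dictionaries, here is a minimal sketch that wraps the same `defaultdict` loop in a function. The name `merge_list_dicts` and the `sample` data are hypothetical, chosen only for this example, and the printed key order assumes Python 3.7+.

```python
from collections import defaultdict


def merge_list_dicts(dicts):
    """Concatenate list values key by key; keys may differ between dicts."""
    merged = defaultdict(list)
    for d in dicts:
        for key, value in d.items():
            # extend copies the items, so the input lists are not modified
            merged[key].extend(value)
    return dict(merged)


# Hypothetical input in which no key appears in every dict.
sample = [
    {'number': [1, 2], 'alphabet': ['a']},
    {'number': [3]},
    {'symbol': ['#'], 'alphabet': ['b', 'c']},
]

result = merge_list_dicts(sample)
print(result)
# {'number': [1, 2, 3], 'alphabet': ['a', 'b', 'c'], 'symbol': ['#']}
```

A key missing from some dicts simply contributes nothing for those dicts, and an empty input list gives an empty dict.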
Rory Daulton