I have a list of dictionaries in the following format:

foo = [
    {'a': 'x', 'b': 'y', 'c': 'z'},
    {'a': 'j', 'c': 'z'}
]

I want to group this list of dictionaries into a single dictionary, like:

bar = {
    'a': ['x', 'j'],
    'b': ['y', None],
    'c': ['z', 'z']
}

What I currently do is loop through all the dicts in foo to build a list of keys, then loop over them again to create bar. Is there a simpler way to accomplish this? Can anyone help?

akhilsp

3 Answers

bar = {
    k: [d.get(k) for d in foo]
    for k in set().union(*foo)
}

Things to google:

  • python list comprehension
  • python dict comprehension
  • python star
  • python dict get
  • python set union
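
As a rough illustration of how those pieces fit together, here is the same one-liner split into steps. It is only a sketch; the intermediate name all_keys is introduced here for readability.

```python
foo = [
    {'a': 'x', 'b': 'y', 'c': 'z'},
    {'a': 'j', 'c': 'z'},
]

# Iterating over a dict yields its keys, so unpacking the list with *
# passes each dict to union() as a separate iterable of keys.
all_keys = set().union(*foo)  # all_keys is an illustrative name

# d.get(k) returns None when k is missing, so every list ends up
# with one entry per dict in foo.
bar = {k: [d.get(k) for d in foo] for k in all_keys}

print(sorted(all_keys))  # ['a', 'b', 'c']
print(bar['b'])          # ['y', None]
```

Because set().union() accepts zero arguments, this should also return an empty dict when foo is an empty list.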
Alex Hall

I am just going to complement Alex Hall's solution here so that it does not return a lot of None values:

def merge_dictionary_list(dict_list):
  return {
    k: [d.get(k) for d in dict_list if k in d] # explanation A
    for k in set().union(*dict_list) # explanation B
  }

Explanation:

  • The whole thing inside {} is a dictionary comprehension
  • Explanation A: go through every dictionary in the list and take the value for the current key k only if the dictionary being evaluated (d) actually has that key.

OBS: Without the `if k in d` condition, a None would be appended to a key's list for every dictionary that lacks that key, which happens whenever the dictionaries do not all share the same keys.

  • Explanation B: collect the keys of every dictionary in the list and combine them with set().union. A set can only hold distinct elements, so each key appears exactly once.
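
To make the effect of the condition concrete, the sketch below builds the result both ways from the question's data. The names keys, padded and filtered are illustrative only.

```python
foo = [
    {'a': 'x', 'b': 'y', 'c': 'z'},
    {'a': 'j', 'c': 'z'},
]

keys = set().union(*foo)

# Without the condition: one entry per dict, None where the key is missing.
padded = {k: [d.get(k) for d in foo] for k in keys}

# With the condition: dicts that lack the key are skipped entirely.
filtered = {k: [d[k] for d in foo if k in d] for k in keys}

print(padded['b'])    # ['y', None]
print(filtered['b'])  # ['y']
```

One trade-off to keep in mind: with the condition, the position of a value in its list no longer tells you which dictionary it came from.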

If you want to do it the traditional way, just go with:

def merge_list_of_dictionaries(dict_list):
  new_dict = {}
  for d in dict_list:
    for d_key in d:
      if d_key not in new_dict:
        new_dict[d_key] = []
      new_dict[d_key].append(d[d_key])
  return new_dict

I think the first solution looks more elegant, but the second one is more legible/readable.
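
If the explicit "create the list if it is missing" check feels noisy, the loop version can probably be shortened with collections.defaultdict. This is only a sketch, and the function name merge_with_defaultdict is made up for this example.

```python
from collections import defaultdict


def merge_with_defaultdict(dict_list):  # hypothetical name
    # defaultdict(list) creates an empty list the first time a key is seen.
    merged = defaultdict(list)
    for d in dict_list:
        for key, value in d.items():
            merged[key].append(value)
    # Convert back to a plain dict so missing keys raise KeyError again.
    return dict(merged)


foo = [
    {'a': 'x', 'b': 'y', 'c': 'z'},
    {'a': 'j', 'c': 'z'},
]
result = merge_with_defaultdict(foo)
print(result['b'])  # ['y']
```

Like the two functions above, this skips missing keys instead of filling in None.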

Kind Regards :)

Nayanexx.py

I would do this in two steps:

  1. Collect all keys into a single iterable:

    >>> import operator
    >>> from functools import reduce
    >>> all_keys = reduce(operator.or_, (d.keys() for d in foo))
    >>> all_keys
    {'a', 'b', 'c'}
    
  2. Use a dict comprehension to create the desired result:

    >>> bar = {key: [d.get(key) for d in foo] for key in all_keys}
    >>> bar
    {'a': ['x', 'j'], 'b': ['y', None], 'c': ['z', 'z']}
    
Jean-François Fabre
Aran-Fey
  • Good one. May I suggest `all_keys = reduce(operator.or_, map(dict.keys, foo))`? – Jean-François Fabre Aug 12 '17 at 10:41
  • This is Python 3 only. – Alex Hall Aug 12 '17 at 10:44
  • @Jean-FrançoisFabre That's a viable alternative. I chose to avoid `map` because it forces me to name the class explicitly (`dict.keys`). The generator expression, on the other hand, also works with dict subclasses or any other objects that have a `.keys()` method. – Aran-Fey Aug 12 '17 at 10:46
  • @AlexHall Maybe. If the subclass has overridden the `keys` method, who knows what would happen. But let's drop this discussion; in the end the differences are so minor that it comes down to preference. – Aran-Fey Aug 12 '17 at 10:50
  • You could also do `set(chain.from_iterable(foo))` to get a set of all keys. – poke Aug 12 '17 at 10:51
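
For reference, the `chain.from_iterable` suggestion from the last comment could look roughly like this when combined with the dict comprehension from step 2. It is a sketch, not a tested drop-in replacement.

```python
from itertools import chain

foo = [
    {'a': 'x', 'b': 'y', 'c': 'z'},
    {'a': 'j', 'c': 'z'},
]

# chain.from_iterable(foo) iterates over each dict in turn, yielding its
# keys; set() then removes the duplicates.
all_keys = set(chain.from_iterable(foo))

bar = {key: [d.get(key) for d in foo] for key in all_keys}
print(bar['b'])  # ['y', None]

# An empty input simply gives an empty set of keys.
print(set(chain.from_iterable([])))  # set()
```

Unlike `reduce` called without an initial value, which raises a TypeError on an empty sequence, this variant handles an empty list without special-casing.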