-2

I found the question "How do I merge two dictionaries in a single expression (taking union of dictionaries)?", where somebody wanted to merge dictionaries as a union. I want to merge dictionaries that share the same key (where I don't know in advance which key that is) and still keep ALL the information. When I tried the answer from that question, I got the union, but only the values of the second dictionary were kept. That is not what I want. What I want is:

Let's say I have two dictionaries (written here as lists of single-key dicts) whose values are lists, which in turn contain dicts with different keys:

myDict1 = [
  {'key1': 'list1'},
  {'key2': 'list2'},
  {'key3': 'list3'}
] 

myDict2 = [
  {'key4': 'list4'},
  {'key1': 'list5'},
  {'key3': 'list6'},
  {'key5': 'list7'}
]

Now I want to merge the entries whose keys are the same, e.g. key1, but I don't know what key1 is, so I can't use a criterion like == 'key1'.

The result should be (I use "union_with_keys_and_values_from_both" to make clear what I want to achieve):

myMergedDict = [
  {'key1': 'union_with_keys_and_values_from_both(list1,list5)'},
  {'key2': 'list2'},
  {'key3': 'union_with_keys_and_values_from_both(list3,list6)'},
  {'key4': 'list4'},
  {'key5': 'list7'},
]

So every value of both dictionaries should be kept when merged.

Further remark:

Let's say

list1 = (dict1)

list5 = (dict2, dict3)

dict1 = [
  {'key1': 'value1'},
  {'key2': 'value2'},
  {'key3': 'value3'}
] 

dict2 = [
  {'key4': 'value4'},
  {'key2': 'value5'},
  {'key5': 'value6'}
]

dict3 = [
  {'key6': 'value7'},
  {'key4': 'value8'},
  {'key7': 'value9'}
] 

Then union_with_keys_and_values_from_both(list1,list5) should result in

unionList1List5 = [
  {'key1': 'value1'},
  {'key2': ('value2', 'value5')},
  {'key3': 'value3'},
  {'key4': ('value4', 'value8')},
  {'key5': 'value6'},
  {'key6': 'value7'},
  {'key7': 'value9'}
]
matlabalt

2 Answers

2

If your union operation is concatenation, this should do it:

a = {'a': 'foo', 'b':'bar', 'c': 'baz'}
b = {'a': 'spam', 'c':'ham', 'x': 'blah'}

# items() returns a view in Python 3, so wrap it in list() before concatenating
r = dict(list(a.items()) + list(b.items()) +
    [(k, a[k] + b[k]) for k in set(b) & set(a)])

Which will give you this:

>>> a = {'a': 'foo', 'b':'bar', 'c': 'baz'}
>>> b = {'a': 'spam', 'c':'ham', 'x': 'blah'}
>>>
>>> r = dict(list(a.items()) + list(b.items()) +
...     [(k, a[k] + b[k]) for k in set(b) & set(a)])
>>> print(r)
{'a': 'foospam', 'b': 'bar', 'c': 'bazham', 'x': 'blah'}
>>>

https://stackoverflow.com/a/11012181/926014

You can apply basically any operation or function to a[k] and b[k].

Example:

For tuple:

>>> r = dict(list(a.items()) + list(b.items()) + [(k, (a[k], b[k])) for k in set(b) & set(a)])
>>> print(r)
{'a': ('foo', 'spam'), 'b': 'bar', 'c': ('baz', 'ham'), 'x': 'blah'}

For list:

>>> r = dict(list(a.items()) + list(b.items()) + [(k, [a[k], b[k]]) for k in set(b) & set(a)])
>>> print(r)
{'a': ['foo', 'spam'], 'b': 'bar', 'c': ['baz', 'ham'], 'x': 'blah'}
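
Since only the combining step changes between these variants, the pattern could be wrapped in a small helper that takes the combining function as a parameter. This is just a sketch, assuming Python 3.5+ (for the `{**a, **b}` unpacking); `merge_with` and `combine` are names made up for illustration, not part of any library.

```python
def merge_with(a, b, combine):
    """Merge two dicts; values of shared keys go through combine(a_val, b_val)."""
    merged = {**a, **b}  # plain union first (b's value wins on shared keys)
    for k in a.keys() & b.keys():  # keys present in both dicts
        merged[k] = combine(a[k], b[k])  # overwrite with the combined value
    return merged


a = {'a': 'foo', 'b': 'bar', 'c': 'baz'}
b = {'a': 'spam', 'c': 'ham', 'x': 'blah'}

print(merge_with(a, b, lambda x, y: (x, y)))  # tuple variant
print(merge_with(a, b, lambda x, y: x + y))   # concatenation variant
```

Passing `lambda x, y: [x, y]` instead should give the list variant shown above.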
Zatarra
  • As long as by "union" the OP means "concatenate the strings"... – GPhilo Jun 28 '21 at 09:50
  • You can basically do any operation there, between an element of a and an element of b with the same key. – Zatarra Jun 28 '21 at 09:52
  • Of course. I would suggest making that explicit in your answer, though: the OP seems a bit vague on what the union actually is and, as you said, it's fairly easy to replace the concatenation with any `union(a,b)` definition. – GPhilo Jun 28 '21 at 09:54
  • The question contains a list of dictionaries, but you seem to have taken it differently. – Art Jun 28 '21 at 09:59
  • I edited my question so it is clear that I don't mean concatenate but make a list of it. – matlabalt Jun 28 '21 at 09:59
  • @matlabalt I added examples for tuple and for list – Zatarra Jun 28 '21 at 10:18
  • @Zatarra it doesn't work for me, it says *** TypeError: unsupported operand type(s) for +: 'dict_items' and 'dict_items' – matlabalt Jun 28 '21 at 11:18
0

Your question is somewhat confusingly worded, given that you're not actually attempting to take the union of two dictionaries. You've also named variables dict1, myDict1 and myMergedDict, even though they're not dictionaries but lists of dictionaries. Regardless, here's a solution to your problem that makes use of itertools.groupby (see the itertools documentation):

from itertools import groupby    

dict_list1 = [
  {'key1': 'dict1'},
  {'key2': 'dict2'},
  {'key3': 'dict3'}
] 

dict_list2 = [
  {'key4': 'dict4'},
  {'key1': 'dict5'},
  {'key3': 'dict6'},
  {'key5': 'dict7'}
]

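def first_key_of_dict(x):
    # Each dict is assumed to hold exactly one key; return that key.
    return next(iter(x))

# groupby only groups consecutive items, so sort by key first.
grouped = groupby(sorted(dict_list1 + dict_list2, key=first_key_of_dict), key=first_key_of_dict)

def get_result(k, tupled_g):
    # A key that occurs only once keeps its original dict unchanged.
    if len(tupled_g) == 1:
        return tupled_g[0]
    # Otherwise collect all values for that key into a tuple.
    return {k: tuple(x[k] for x in tupled_g)}

combined_list = [get_result(k, tuple(g)) for k, g in grouped]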

print(combined_list)

Output:

[{'key1': ('dict1', 'dict5')}, {'key2': 'dict2'}, {'key3': ('dict3', 'dict6')}, {'key4': 'dict4'}, {'key5': 'dict7'}]
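
Because groupby only groups consecutive items, the approach above has to sort first, which requires the keys to be comparable with each other. If that is a concern, a possible alternative is to collect the values in a collections.defaultdict instead. The following is only a sketch under the assumption that keys should come out in order of first appearance (Python 3.7+); `merge_dict_lists` is a hypothetical helper name, not an existing function.

```python
from collections import defaultdict


def merge_dict_lists(*dict_lists):
    """Collect the values of equal keys across several lists of dicts."""
    collected = defaultdict(list)  # key -> every value seen for that key
    for dict_list in dict_lists:
        for d in dict_list:
            for k, v in d.items():
                collected[k].append(v)
    # A key seen once keeps its bare value; repeated keys get a tuple.
    return [{k: vs[0] if len(vs) == 1 else tuple(vs)}
            for k, vs in collected.items()]


dict_list1 = [{'key1': 'dict1'}, {'key2': 'dict2'}, {'key3': 'dict3'}]
dict_list2 = [{'key4': 'dict4'}, {'key1': 'dict5'}, {'key3': 'dict6'}, {'key5': 'dict7'}]

print(merge_dict_lists(dict_list1, dict_list2))
```

For these inputs it should print the same list as the groupby version. Since it accepts any number of lists, it could also be applied to dict1, dict2 and dict3 from the "Further remark" in the question.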
Alex Waygood