Pythonic way to merge keys with common values for a single dictionary

Question

How do I merge the keys of a dictionary that share a common value into a tuple? For example:

A = {'E2': {'5', '7'}, 'E3': {'4', '8'}, 'E5': {'5', '7'}, 'E8': {'4', '8'}}

output = {('E2', 'E5'): {'5', '7'}, ('E3', 'E8'): {'4', '8'}}

My attempt:

A = {'E2': {'5', '7'}, 'E3': {'4', '8'}, 'E5': {'5', '7'}, 'E8': {'4', '8'}}

output = {}
seen = []
for k, v in A.items():
    if v not in [s[1] for s in seen]: # not seen this value yet
        print('NOT SEEN')
        print(k, v)
        seen.append([k,v])
        output[k] = v
    else: # already seen it 
        print('SEEN')
        print(k, v)
        # determine where we've seen it 
        where = [x for x in seen if x[1]==v]
        output.pop(where[0][0])
        output[(where[0][0], k)] = v


print('OUTPUT = ', output)

This prints:

OUTPUT =  {('E2', 'E5'): {'7', '5'}, ('E3', 'E8'): {'4', '8'}}

Okay, what stops you from doing this? Do you have a specific question? — vaultah, Jul 08 '17 at 21:52
We're saying that we would like to see some effort on your part. "give me the code" kind of questions are generally frowned upon. Especially when one or two loops should be enough to solve the problem. — Aran-Fey, Jul 08 '17 at 22:01
Possible duplicate of [How to merge two Python dictionaries in a single expression?](https://stackoverflow.com/questions/38987/how-to-merge-two-python-dictionaries-in-a-single-expression) — ApplePie, Jul 08 '17 at 22:08
Possible duplicate of [Merge Keys by common value from the same dictionary](https://stackoverflow.com/questions/44392023/merge-keys-by-common-value-from-the-same-dictionary) — Vinícius Figueiredo, Jul 08 '17 at 22:20
@vaultah Nothing, except the fact that my solution looks awful — Ryan J. Shrott, Jul 08 '17 at 22:21

Raymond Hettinger · Answer 1 · 2017-07-08T22:52:29.057

I would make the transformation in two passes:

>>> A = {'E2': {'5', '7'}, 'E3': {'4', '8'}, 'E5': {'5', '7'}, 'E8': {'4', '8'}}

# First pass:  Create a reverse one-to-many mapping. 
# The original set() value gets converted to a hashable frozenset()
# and used as a key.  The original scalar string key gets accumulated
# in a list to track the multiple occurrences.
>>> reverse = {}
>>> for key, value in A.items():
        reverse.setdefault(frozenset(value), []).append(key)

# Second pass:  reverse the keys and values.  The list of matching
# values gets converted to a hashable tuple (as specified by the OP)
# and the frozenset() gets restored back to the original set() type.
>>> {tuple(value) : set(key) for key, value in reverse.items()}
{('E2', 'E5'): {'5', '7'}, ('E3', 'E8'): {'8', '4'}}

This gives the output expected by the OP.

Note that the sets in the original input have no guaranteed order, and before Python 3.7 neither did the input dictionary itself. Accordingly, the ordering of terms in the output should not be relied on.
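For use outside the interactive prompt, the same two passes can be wrapped in a function. This is only a minimal sketch; the name `merge_keys_by_value` is made up for illustration and does not come from the answer above.

```python
def merge_keys_by_value(mapping):
    """Group keys whose set values are equal; return {tuple_of_keys: set}."""
    # First pass: frozenset(value) -> list of keys sharing that value.
    reverse = {}
    for key, value in mapping.items():
        reverse.setdefault(frozenset(value), []).append(key)
    # Second pass: swap keys and values, restoring the requested types.
    return {tuple(keys): set(value) for value, keys in reverse.items()}


A = {'E2': {'5', '7'}, 'E3': {'4', '8'}, 'E5': {'5', '7'}, 'E8': {'4', '8'}}
merged = merge_keys_by_value(A)
print(merged)
```

If a reproducible key order matters, one option is to build each key with `tuple(sorted(keys))`, which requires the keys to be mutually comparable.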

Kirill Gagarski · Answer 2 · 2017-07-08T23:33:03.517

import itertools

A = {'E2': {'5', '7'}, 'E3': {'4', '8'}, 'E5': {'5', '7'}, 'E8': {'4', '8'}}

def key(x): 
    # List supports ordering
    return sorted(list(x[1]))

def gen():
    for (group_key, group) in itertools.groupby(sorted(A.items(), key=key), key=key):
        gl = list(group)
        yield (tuple(x[0] for x in gl), 
               gl[0][1]  # group_key is a list, but we want original set
              )

print(dict(gen()))

If you are satisfied that the set -> list -> set round trip is safe, you can write a one-liner instead of the generator:

print(dict((tuple(g[0] for g in group), set(group_key)) for 
           (group_key, group) in 
           itertools.groupby(sorted(A.items(), key=key), key=key)))

UPD: So, what exactly is going on here?

First of all, we convert the dict to an iterable of tuples by calling .items(). We want to group together the items of this iterable that have the same second element (index 1, the former dict value). This is exactly what itertools.groupby does. Its arguments are an iterable and a key to group by. It would seem that key=lambda kv: kv[1] is the way to go. Unfortunately, it is not enough on its own. We can compare sets for equality, but the docs say the iterable should already be sorted on the same key, because groupby only merges adjacent items. The sorted function, in turn, needs a key that can be compared for order. Sets do not have a total order (< on sets tests for subsets), but lists do. We can safely create a list that contains the same elements as the set, but we must sort it, because equal sets can produce lists in different orders: {5, 7} == {7, 5}, but [5, 7] != [7, 5].
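The effect of sorting before grouping can be checked directly. The sketch below assumes Python 3.7+ (so the dict iterates in insertion order); the helper `keys_per_group` is a hypothetical name introduced only for this illustration.

```python
from itertools import groupby

A = {'E2': {'5', '7'}, 'E3': {'4', '8'}, 'E5': {'5', '7'}, 'E8': {'4', '8'}}


def keys_per_group(items, key):
    """Return the dict keys found in each run that groupby produces."""
    return [[k for k, _ in group] for _, group in groupby(items, key=key)]


# Without sorting, equal values that are not adjacent land in separate groups.
unsorted_groups = keys_per_group(A.items(), key=lambda kv: kv[1])

# Sorting by the sorted list of elements puts equal sets next to each other.
ordered = sorted(A.items(), key=lambda kv: sorted(kv[1]))
sorted_groups = keys_per_group(ordered, key=lambda kv: kv[1])

# "<" on sets means "proper subset", so two unrelated sets are unordered.
neither_smaller = (not {'5', '7'} < {'4', '8'}) and (not {'4', '8'} < {'5', '7'})

print(unsorted_groups)
print(sorted_groups)
print(neither_smaller)
```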

Now, after sorting and grouping we have the following data structure:

[
   (key_dict_value as list, iterable of (dict_key, dict_value) that has dict_value == key_dict_value),
   ...
]

Now we can iterate over this iterable and create another iterable of tuples. We take the second element (the iterable, index 1) of each tuple, extract the dict keys from it, and convert them to a tuple (this is the key of our future dictionary). The value of our future dictionary is a value from the original dictionary. We can take it either from any item of that second element (this iterable cannot be empty since groupby cannot produce empty groups, see the first snippet) or from key_dict_value by converting it back to a set (this is safe because the list was produced from a set, so it has no duplicate elements, see the second snippet).
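To make the intermediate structure concrete, the groups can be materialised into lists and inspected. This is a sketch under the same assumptions as the snippets above; the names `structure`, `from_first_item` and `from_group_key` are illustrative only.

```python
from itertools import groupby

A = {'E2': {'5', '7'}, 'E3': {'4', '8'}, 'E5': {'5', '7'}, 'E8': {'4', '8'}}


def key(item):
    # A sorted list of the set's elements: orderable and equal for equal sets.
    return sorted(item[1])


# Materialise each group, because groupby yields one-shot iterators.
structure = [(group_key, list(group))
             for group_key, group in groupby(sorted(A.items(), key=key), key=key)]

# Option 1: take the value from the first item of each (never empty) group.
from_first_item = {tuple(k for k, _ in items): items[0][1]
                   for _, items in structure}

# Option 2: rebuild the set from the list-valued group key.
from_group_key = {tuple(k for k, _ in items): set(group_key)
                  for group_key, items in structure}

print(structure)
```

Option 1 reuses the set objects stored in `A`, while option 2 creates new sets, which may matter if the result is mutated later.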

UPD2

While I was writing the explanation I realised that a key that only supports equality is not good enough for sorted but is fine for groupby, so here is an even simpler solution that needs neither a separate key function nor a conversion from list back to set:

print(dict((tuple(g[0] for g in group), group_key) for 
           (group_key, group) in itertools.groupby(sorted(A.items(), 
                                                          key=lambda x: sorted(list(x[1]))), 
                                                   key=lambda x: x[1])))

Ajax1234 · Accepted Answer · 2017-07-09T16:06:57.883

You can try this:

from collections import defaultdict

A = {'E2': {'5', '7'}, 'E3': {'4', '8'}, 'E5': {'5', '7'}, 'E8': {'4', '8'}}

second_new = defaultdict(list)

for a, b in A.items():
    # frozenset is hashable and order-independent; tuple(b) depends on set iteration order
    second_new[frozenset(b)].append(a)

final_dict = {tuple(b):set(a) for a, b in second_new.items()}

Output:

{('E8', 'E3'): {'8', '4'}, ('E5', 'E2'): {'5', '7'}}
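When a set is used as a grouping key, a frozenset is a safer choice than a tuple built from the set, because two equal sets are not guaranteed to iterate in the same order (on CPython, for instance, colliding hashes can make the order depend on insertion history). The sketch below uses made-up data to show that frozenset keys merge equal sets regardless of how they were built.

```python
from collections import defaultdict

# Two equal sets built in different insertion orders. Whether tuple(first)
# equals tuple(second) is an implementation detail and should not be relied on.
first = set([0, 8])
second = set([8, 0])

same_set = first == second  # set equality ignores order
same_key = (frozenset(first) == frozenset(second)
            and hash(frozenset(first)) == hash(frozenset(second)))

# Grouping with frozenset keys therefore always merges equal sets.
data = {'a': first, 'b': second, 'c': {1}}
groups = defaultdict(list)
for name, value in data.items():
    groups[frozenset(value)].append(name)

merged = {tuple(names): set(value) for value, names in groups.items()}
print(merged)
```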

edited Jul 09 '17 at 16:06

answered Jul 08 '17 at 23:49

Thanks. This is the simplest and easy to understand. – Ryan J. Shrott Jul 09 '17 at 16:13
I am glad I could help. – Ajax1234 Jul 09 '17 at 16:15
gave you best answer – Ryan J. Shrott Jul 09 '17 at 16:16

cmaher · Answer 4 · 2017-07-09T00:17:52.323

Here's what I worked up using comprehensions. Only requires two intermediate steps and only uses built-in data types.

# get unique values from original dict (frozenset is hashable and order-independent)
targ_values = {frozenset(v) for v in A.values()}

# build lists of original keys that match each of the targ_values
targ_values = {targ_value: [orig_key for orig_key, orig_value in A.items() if orig_value == targ_value] for targ_value in targ_values}

# reverse the order of keys & values and convert types to get desired output
output = {tuple(v): set(k) for k, v in targ_values.items()}
