Pythonic inverse dict non-unique mappings

Question

I have a dictionary like this:

dict1 = {'a':[1,2,3], 'b':[1,2,3,4], 'c':[1,2]}

and want the inverse like this:

dict2 = dict({1:['a','b','c'], 2:['a','b','c'], 3:['a','b'], 4:['b']})

Like these questions:

Inverse Dict in Python \\ In-place dictionary inversion in Python

But I want to do it with non-unique keys and I don't want in-place conversion. I have some code working, but I was wondering if there's a dictionary comprehension way of doing this.

from collections import defaultdict
dict2 = defaultdict(list)
for i in dict1:
    for j in dict1[i]:
        dict2[j].append(i)

I tried this, but it only works for unique mappings. By unique I mean something like "for each value, there is only one key under which the value is listed". So unique mapping: '1: [a], 2: [b], 3: [c] -> a: [1], b: [2], c: [3]' VS non-unique mapping '1: [a], 2: [a, b], 3: [b, c] -> a: [1, 2], b: [2, 3], c: [3]'

dict2 = {j: i for i in dict1 for j in dict1[i]}

I think it must be something like this:

dict2 = {j: [i for i in dict1 if j in dict1[i]] for j in dict1[i]} # I know this doesn't work

Besides it not working, it seems like a comprehension like this would be inefficient. Is there an efficient, one liner way of doing this?

It won't work for non-unique values, per definition, keys in Dictionaries or Hash-tables are __unique__ — oz123, Jan 27 '14 at 07:28
Python dictionaries don't support duplicate keys --> `http://stackoverflow.com/questions/10664856/make-dictionary-with-duplicate-keys-in-python` — Priyank Patel, Jan 27 '14 at 07:28
I guess my usage of "unique" is ambiguous. What I mean by "unique" is that if the original dictionary has a 1-1 mapping from key->value. By unique I meant something like "for each value, there is only one key under which the value is listed". So unique mapping: `'1: [a], 2: [b], 3: [c] -> a: [1], b: [2], c: [3]' vs '1: [a], 2: [a, b], 3: [b, c] -> a: [1, 2], b: [2, 3], c: [3]'` — dantiston, Jan 27 '14 at 08:07
"...it seems like a comprehension like this would be inefficient." It seems like you're trying to prematurely optimize your code. Chances are that using an explicit `for` loop will have no significant impact on the performance of your code. — Joel Cornett, Jan 27 '14 at 08:12
What exactly do you want to do with duplicates? Discard them? Group them in a list? If discard, how do you define which ones? — mhlester, Jan 27 '14 at 08:29

score 4 · Answer 1 · answered Jan 27 '14 at 07:33

4

Standard dict:

>>> dict2 = {}
>>> for key, values in dict1.items():
...     for value in values:
...             dict2.setdefault(value, []).append(key)
... 
>>> dict2
{1: ['a', 'c', 'b'], 2: ['a', 'c', 'b'], 3: ['a', 'b'], 4: ['b']}

defaultdict:

>>> dict2 = defaultdict(list)
>>> for key, values in dict1.items():
...     for value in values:
...             dict2[value].append(key)
... 
>>> dict2
{1: ['a', 'c', 'b'], 2: ['a', 'c', 'b'], 3: ['a', 'b'], 4: ['b']}

answered Jan 27 '14 at 07:33

mhlester

22,781
10
52
75

This is essentially the same as the code I posted, and it doesn't address either of my questions. – dantiston Jan 27 '14 at 08:13
This is giving precisely the output you said you want. If you want a different output, mock it up! – mhlester Jan 27 '14 at 08:17

dantiston · Accepted Answer · 2014-01-28T20:20:53.317

I figured out an answer based on Vroomfondel's answer:

dict1 = {'a':[1,2,3], 'b':[1,2,3,4], 'c':[1,2]}
dict2 = {item: [key for key in dict1 if item in dict1[key]] for value in dict1.values() for item in value}

This isn't the fastest, but it is a one liner and it is not the slowest of the options presented!

from timeit import timeit

methods = [['Vroomfondel1', '''from collections import defaultdict
import itertools
dict1 = {'a':[1,2,3], 'b':[1,2,3,4], 'c':[1,2]}
dict2 = defaultdict(list)
for k,v in itertools.chain.from_iterable([itertools.product(vals,key) for key,vals in dict1.items()]):
    dict2[k].append(v)'''],

['Vroomfondel2', '''from collections import defaultdict
import itertools
dict1 = {'a':[1,2,3], 'b':[1,2,3,4], 'c':[1,2]}
dict2 = defaultdict(list)
[dict2[k].append(v) for k,v in itertools.chain.from_iterable([itertools.product(vals,key) for key,vals in dict1.items()])]'''],


['***Vroomfondel2 mod', '''dict1 = {'a':[1,2,3], 'b':[1,2,3,4], 'c':[1,2]}
dict2 = {item: [key for key in dict1 if item in dict1[key]] for value in dict1.values() for item in value}'''],

['mhlester1', '''dict1 = {'a':[1,2,3], 'b':[1,2,3,4], 'c':[1,2]}
dict2 = {}
for key, values in dict1.items():
    for value in values:
        dict2.setdefault(value, []).append(key)'''],

['mhlester1 mod', '''from collections import defaultdict
dict1 = {'a':[1,2,3], 'b':[1,2,3,4], 'c':[1,2]}
dict2 = defaultdict(list)
for key, values in dict1.items():
    for value in values:
        dict2[value].append(key)'''],

['mhlester2', '''from collections import defaultdict
dict1 = {'a':[1,2,3], 'b':[1,2,3,4], 'c':[1,2]}
dict2 = defaultdict(list)
for key, values in dict1.items():
    for value in values:
        dict2[value].append(key)'''],

['initial', '''from collections import defaultdict
dict1 = {'a':[1,2,3], 'b':[1,2,3,4], 'c':[1,2]}
dict2 = defaultdict(list)
for i in dict1:
    for j in dict1[i]:
        dict2[j].append(i)''']

]

for method in methods:
    print "% 15s" % (method[0]), '\t', timeit(method[1], number=10000)

prints out:

   Vroomfondel1     0.202519893646
   Vroomfondel2     0.164724111557
***Vroomfondel2 mod     0.114083051682
      mhlester1     0.0599339008331
  mhlester1 mod     0.091933965683
      mhlester2     0.0900268554688
        initial     0.0953099727631

if the first dict is : A={1:('a','b'), 2:('b','e','c'), 3:('a','f'), 4:('c','d'), 5:('d','e','f')} with dict2 = {item: [key for key in A if item in A[key]] for value in A.values() for item in value} print dict2 Finally it yields: {'a': [1, 3], 'c': [2, 4], 'b': [1, 2], 'e': [2, 5], 'd': [4, 5], 'f': [3, 5]} — Jean-Pat, May 27 '14 at 20:16

Vroomfondel · Answer 3 · 2014-01-29T21:35:48.967

As a one-liner (thanks to mhlesters input), but with so-so readability (and only working because the values in dict2 are mutable and thus setdefault returning a reference to them):

import itertools
[dict2.setdefault(k,[]).append(v) for k,v in itertools.chain.from_iterable([itertools.product(vals,[key]) for key,vals in dict1.items()])]

Or with a for loop:

import collections
import itertools
dict2=collections.defaultdict(list)
for k,v in itertools.chain.from_iterable([itertools.product(vals,[key]) for key,vals in dict1.items()]):
    dict2[k].append(v)

Pythonic inverse dict non-unique mappings

3 Answers3