0

Say I have a dictionary like this

D = {'a': [1,2,3], 'b': [2,3,4], 'c': [3,4,5]}

For each unique element present across all the lists, I want to find the associated dict keys.

The wanted output is thus:

out = {1: ['a'], 2: ['a', 'b'], 3: ['a', 'b', 'c'], 4: ['b', 'c'], 5: ['c']}

How do I do this most efficiently?

EDIT: I need to do this for a large dict with ~100 keys and each list between 50-10000 elements

Emjora
  • 379
  • 2
  • 11
  • That question doesn't seem focused on efficiency, but yes, essentially the same problem. It seems that the most upvoted answer (not the accepted one) works. – Emjora Aug 25 '21 at 08:16

3 Answers3

5

You can use a collections.defaultdict:

from collections import defaultdict

out = defaultdict(list)

for k, vals in D.items():
    for val in vals:
        out[val].append(k)

out
# defaultdict(<class 'list'>, {1: ['a'], 2: ['a', 'b'], 3: ['a', 'b', 'c'], 4: ['b', 'c'], 5: ['c']})

Alternatively, use dict.setdefault:

out = {}

for k, vals in D.items():
    for val in vals:
        out.setdefault(val, []).append(k)
user2390182
  • 72,016
  • 6
  • 67
  • 89
1

You could try this dictionary comprehension with a list comprehension:

>>> {val: [k for k in D.keys() if val in D[k]] for v in D.values() for val in v}
{1: ['a'], 2: ['a', 'b'], 3: ['a', 'b', 'c'], 4: ['b', 'c'], 5: ['c']}
>>> 

Or you could try this for loop:

res = {}
for k, val in D.items():
    for v in val:
        if v in res:
            res[v].append(k)
        else:
            res[v] = [k]

print(res)

Output:

{1: ['a'], 2: ['a', 'b'], 3: ['a', 'b', 'c'], 4: ['b', 'c'], 5: ['c']}
Gilad Green
  • 36,708
  • 7
  • 61
  • 95
U13-Forward
  • 69,221
  • 14
  • 89
  • 114
  • 3
    One-liner, but terrible algorithmically. You shouldn't iterate all keys *and* value lists for every single value list item in this data structure. – user2390182 Aug 25 '21 at 08:02
1

Use get function:

for i,j in D.items():
    for k in j:
        out[k] = out.get(k,[])+[i]
Nothing special
  • 415
  • 3
  • 8