One liner to revert dictionary with non-unique values with O(n) complexity

Question

Is it possible to produce one liner (i.e. comprehension) with better time complexity than O(n²) as below?

my_map = {'A': 'x',
          'B': 'y',
          'C': 'x',
          'D': 'z'}

rev_map = {b: [a2 for a2 in my_map.keys() if my_map[a2] == b]
           for a, b in my_map.items()}

Have not found any in related Reverse / invert a dictionary mapping.

@alexpdev I see. Understood this is due to python strict evaluation. Surely there must be way around it with something meta... — aeiou, Jun 19 '22 at 08:50
if you're worried about speed, you can make it 1.5% faster by doing ```for b in my_map.values()``` rather than ```for a, b in my_map.items()``` — Nin17, Jun 19 '22 at 08:55
and 4.5% faster again by doing ```{b: [a2 for a2, b2 in my_map.items() if b2 == b] for b in my_map.values()}``` — Nin17, Jun 19 '22 at 09:00

score 2 · Answer 1 · answered Jun 19 '22 at 09:24

2

Here's an O(n log n) one-liner:

>>> {k : list(map(itemgetter(1), g)) for k, g in groupby(sorted(map(itemgetter(1, 0), my_map.items())), itemgetter(0))}
{'x': ['A', 'C'], 'y': ['B'], 'z': ['D']}

The required imports are:

from itertools import groupby from operator import item getter

answered Jun 19 '22 at 09:24

Raymond Hettinger

216,523
63
388
485

1

This definitely answers the OP's question and it's considerably faster than any other comprehension suggestions so far but still not as fast as a traditional/simplistic approach. It is also a fine example of when **not** to use one-liners. Interesting from an academic perspective but that's about all – DarkKnight Jun 19 '22 at 10:04

juanpa.arrivillaga · Accepted Answer · 2022-06-30T05:38:03.993

The issue isn't one-liners. The problem is that comprehensions are for expressing mapping/filtering operations. And you cannot get an O(N) implementation of this using map/filter. You need a reduce. Of course, to keep it on one line, with a lambda expression, it will be hacky and ugly:

And the one-liner:

reduce(lambda d, k: [d.setdefault(my_map[k], []).append(k), d][1], my_map, {})

In the REPL:

>>> my_map = {'A': 'x',
...           'B': 'y',
...           'C': 'x',
...           'D': 'z'}
>>> import functools
>>> functools.reduce(lambda d, k: [d.setdefault(my_map[k], []).append(k), d][1], my_map, {})
{'x': ['A', 'C'], 'y': ['B'], 'z': ['D']}

But please don't use this. Just write the regular for-loop.

DarkKnight · Answer 3 · 2022-06-19T09:50:37.623

I digress here because I don't understand why this has to be done as a one-liner. Consider this:

from timeit import timeit

my_map = {'A': 'x',
          'B': 'y',
          'C': 'x',
          'D': 'z'}

def func1():
    return {b: [a2 for a2, b2 in my_map.items() if b2 == b] for b in my_map.values()}

def func2():
    nm = {}
    for k, v in my_map.items():
        nm.setdefault(v, []).append(k)
    return nm

for func in func1, func2:
    print(func.__name__, timeit(lambda: func()))

Output:

func1 1.9566871839997475
func2 0.6075634010003341

Note that the dictionary comprehension takes more than 3 times as long as a more traditional/simplistic approach.

EDIT:

The time difference (factor) between the 2 methods increases significantly when the source dictionary is larger.

from timeit import timeit
import random
import string
my_map = {}

for k in string.ascii_uppercase:
    my_map[k] = random.choice('xyz')

print(my_map)

def func1():
    return {b: [a2 for a2, b2 in my_map.items() if b2 == b] for b in my_map.values()}

def func2():
    nm = {}
    for k, v in my_map.items():
        nm.setdefault(v, []).append(k)
    return nm

for func in func1, func2:
    print(func.__name__, timeit(lambda: func(), number=250_000))

Output:

{'A': 'y', 'B': 'x', 'C': 'x', 'D': 'y', 'E': 'y', 'F': 'y', 'G': 'z', 'H': 'x', 'I': 'y', 'J': 'z', 'K': 'z', 'L': 'y', 'M': 'x', 'N': 'x', 'O': 'z', 'P': 'x', 'Q': 'x', 'R': 'x', 'S': 'y', 'T': 'y', 'U': 'x', 'V': 'x', 'W': 'x', 'X': 'z', 'Y': 'z', 'Z': 'y'}
func1 8.602543555000011
func2 0.7367826070003503

Proof positive that one-liners are not always a great idea

Good and educative point! I do however still think python benefits from more capabilities to evaluate complex things efficiently in one line. Unless this is not in the scope of python... — aeiou, Jun 19 '22 at 09:54
@aeiou Comprehension in Python is indeed faster than standard for loops, but again as shown in this example algorithmic improvements can surpass by a large margin the speed gains of comprehension. — Gabriel Cia, Jun 19 '22 at 10:08
@GabrielCia You miss the point. Comprehensions can be significantly **slower** than simple (readable) loops — DarkKnight, Jun 19 '22 at 10:27
@GabrielCia not really, no. Comprehensions *are implemented as standard for loops*, with some minor optimizations, the largest being the caching of the resolution of the methods which can be reproduced in a regular loop — juanpa.arrivillaga, Jun 19 '22 at 11:07
@AlbertWinestein I meant in the general case when comparing equivalent code in standard for-loop vs comprehension, in which case comprehension will always be (slightly) faster as far as I am aware. What this example shows is that an O(n) algorithm without comprehension beats an O(n^2) algorithm that uses comprehension. — Gabriel Cia, Jun 19 '22 at 11:21

score -1 · Answer 4 · answered Jun 19 '22 at 09:03

-1

If I'm guessing correctly you would want to reverse your dictionary as so:

my_map = {
 'A': 'x',
 'B': 'y',
 'C': 'x',
 'D': 'z'
}
rev_map = {
 'x': ['A', 'C'],
 'y': ['B'],
 'z': ['Z'],
}

It can be done in a single pass and get you O(n) with the following code

from collections import defaultdict

rev_map = defaultdict(list)
for key, value in my_map.items():
    rev_map[value].append(key)

If you then want to transform keys with a single value in their list into a string then you can do another pass and do that, that would still be O(n).

answered Jun 19 '22 at 09:03

Gabriel Cia

388
5
13

Hey, I am looking for one liners / comprehension only. – aeiou Jun 19 '22 at 09:08
1

@aeiou Why? You need to bear in mind that one-liners can be inefficient and can become so convoluted that they're difficult to maintain. One-liners just for the sake of it are interesting but not necessarily worthwhile – DarkKnight Jun 19 '22 at 09:30
If speed is what you are worried about, you could just benchmark my code against yours and see if the algorithmic improvement from O(n^2) to O(n) beats dict comprehension vs no comprehension, I'd be curious to see the result. – Gabriel Cia Jun 19 '22 at 09:35

One liner to revert dictionary with non-unique values with O(n) complexity

4 Answers4