7

I want to know if there's a more Pythonic way of doing the following, perhaps using dictionary comprehensions:

A = some list
D = {}
for i,v in enumerate(A):
    if v in D:
        D[v].append(i)
    else:
        D[v] = [i]
lorenzocastillo
  • 985
  • 3
  • 13
  • 24

2 Answers2

12

Using defaultdict:

from collections import defaultdict
D = defaultdict(list)
[D[v].append(i) for i, v in enumerate(A)]

Using setdefault:

D = {}
[D.setdefault(v, []).append(i) for i, v in enumerate(A)]

I can't figure any mean to use a dictionnary comprehension without sorting the data:

from itertools import groupby
from operator import itemgetter
{v: ids for v, ids in groupby(enumerate(sorted(A)), itemgetter(1))}

Performances:

from collections import defaultdict
from itertools import groupby
from operator import itemgetter
from random import randint

A = tuple(randint(0, 100) for _ in range(1000))

def one():
    D = defaultdict(list)
    [D[v].append(i) for i, v in enumerate(A)]

def two():
    D = {}
    [D.setdefault(v, []).append(i) for i, v in enumerate(A)]

def three():
    {v: ids for v, ids in groupby(enumerate(sorted(A)), itemgetter(1))}


from timeit import timeit

for func in (one, two, three):
    print(func.__name__ + ':', timeit(func, number=1000))

Results (as always, the simplest win):

one: 0.25547646999984863
two: 0.3754340969971963
three: 0.5032370890003222
aluriak
  • 5,559
  • 2
  • 26
  • 39
  • 1
    Is it really Pythonic to use list-comprehensions merely as a means of doing work? I'll admit, I've run into that issue before, and considered list-comps as a way to get it done, but it feels hacky. Other than that, though, I agree with `defaultdict` as the clearest answer. – dwanderson Apr 11 '16 at 13:20
  • The other way is using `map`, but list comprehension is [much more readable](http://stackoverflow.com/a/1247490/3077939), and generally [preferred](http://www.u.arizona.edu/~erdmann/mse350/topics/list_comprehensions.html). – aluriak Apr 11 '16 at 13:23
1

You can do the following

d = collections.defaultdict(list)
for i,v in enumerate(A):
    d[v].append(i)

You can see that the values of the resulting dictionary are lists, the elements of which are to be produced while traversing. If you insist on doing a dict comp, you have to first find all the (value, [indices]), then do a dict comp on [(k,[v])], which just means extra acrobatics without any benefit.

C Panda
  • 3,297
  • 2
  • 11
  • 11