0

Given a Python dict of the form:

dict = {'Alice': 2341, 'Beth': 9102, 'Cecil': 3258, ......}

Is there an easy way to print the x keys with the highest numeric values? For example, with x = 2:

Beth   9102
Cecil  3258

Currently this is my attempt:

max = 0
max_word = ""
for key, value in w.word_counts.iteritems():
    if value > max:
        if key not in stop_words:
            max = value
            max_word = key

print max_word
jscs
Superdooperhero
  • 1
    possible duplicate of [Python: Sort a dictionary by value](http://stackoverflow.com/questions/613183/python-sort-a-dictionary-by-value) – Zero Piraeus May 26 '14 at 22:52
  • 1
    You might consider using a [`Counter`](https://docs.python.org/2/library/collections.html#collections.Counter) instead of a dictionary initially. Then you have `word_counts.most_common(x)` – jscs May 26 '14 at 22:55

6 Answers

7

I'd simply sort the items by value (the second element of each pair) in descending order and then pick the first K elements:

d_items = sorted(d.items(), key=lambda x: -x[1])
print d_items[:2]
[('Beth', 9102), ('Cecil', 3258)]

The complexity of this approach is O(N log N + K), not that different from optimal O(N + K log K) (using QuickSelect and sorting just the first K elements).
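
The optimal bound above relies on QuickSelect, which the Python standard library does not provide. A heap-based selection with heapq.nlargest is a different technique that sits in between, at roughly O(N log K). A minimal sketch, written for Python 3 and using only the three sample entries from the question:

```python
import heapq

d = {'Alice': 2341, 'Beth': 9102, 'Cecil': 3258}
k = 2  # number of top entries wanted

# nlargest scans all N pairs while tracking only the k best seen so far,
# and returns them ordered from largest to smallest value.
top_k = heapq.nlargest(k, d.items(), key=lambda item: item[1])
print(top_k)  # [('Beth', 9102), ('Cecil', 3258)]
```

For small dictionaries the difference from a full sort is unlikely to matter; the heap version mostly pays off when K is much smaller than N.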

Danstahr
5

Using collections.Counter.most_common:

>>> from collections import Counter
>>> d = {'Alice': 2341, 'Beth': 9102, 'Cecil': 3258}
>>> c = Counter(d)
>>> c.most_common(2)
[('Beth', 9102), ('Cecil', 3258)]

In CPython, most_common uses sorted (O(n*log n)) when called without an argument. When given k, it uses heapq.nlargest, which might be faster than sorted if k << n and which falls back to max() if k == 1.
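
The attempt in the question also skips keys found in stop_words. One way that filter might be combined with most_common is sketched below for Python 3. The name word_counts stands in for w.word_counts, and the contents of stop_words are made up purely for illustration:

```python
from collections import Counter

word_counts = {'Alice': 2341, 'Beth': 9102, 'Cecil': 3258}
stop_words = {'Beth'}  # hypothetical stop list, for illustration only

# Build a Counter from the entries that are not stop words.
filtered = Counter(
    {word: count for word, count in word_counts.items() if word not in stop_words}
)

# Print the two highest remaining counts, one "key value" pair per line.
for word, count in filtered.most_common(2):
    print(word, count)
```

Filtering before counting the top entries avoids the case where a stop word takes up one of the x slots.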

jfs
3
>>> sorted(dict.items(), key=lambda x: x[1], reverse=True)[:2]
[('Beth', 9102), ('Cecil', 3258)]
Shan Valleru
1
items = sorted(w.word_counts.items(), key=lambda kv: kv[1], reverse=True)  # highest counts first
items[:5]

Replace 5 with the number of elements you want to get.

Lachezar
1
d = {'Alice': 2341, 'Beth': 9102, 'Cecil': 3258}

vs = sorted(d, key=d.get,reverse=True)

l = [(x,d.get(x)) for x in vs[0:2]]
In [4]: l
Out[4]: [('Beth', 9102), ('Cecil', 3258)]
Padraic Cunningham
0

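Convert the dict to a list of (value, key) tuples, e.g. [(2341, 'Alice'), ...], then sort it. No key=lambda ... is needed, because tuples compare by their first element first. Sort in descending order to get the highest values at the front.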
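
A minimal sketch of this approach in Python 3, using the three sample entries from the question. The reverse=True argument and the final slice are assumptions added here to put the highest values first and keep only two of them:

```python
d = {'Alice': 2341, 'Beth': 9102, 'Cecil': 3258}

# Swap each pair so the value comes first; tuples compare element by element.
pairs = [(value, key) for key, value in d.items()]
pairs.sort(reverse=True)  # highest value first; ties fall back to the key

# Print the top two as "key value" lines.
for value, key in pairs[:2]:
    print(key, value)
```

Note that when two values are equal, the order is decided by comparing the keys, which may or may not be what you want.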

furas