Let's say I have a dictionary. I want to find the 4 keys with the highest values. I want to do this in a very basic way. I'm not that advanced with CS. I just want to iterate over the dictionary. Or how should I do this? I realize it cannot be that challenging. I don't want to use heapq. How can I do this?
Asked
Active
Viewed 1,562 times
3
-
What do you mean by highest values? Highest Number? Biggest length? – Zizouz212 Apr 14 '15 at 00:44
-
http://stackoverflow.com/questions/1555968/efficient-way-to-find-the-largest-key-in-a-dictionary-with-non-zero-value – NYB Apr 14 '15 at 00:45
-
Do you care what order the highest values are returned in? – martineau Apr 14 '15 at 00:55
-
@martineau I don't care in what order they are returned in – Apr 14 '15 at 02:51
-
@martineau I just need it to be done in the most basic manner possible – Apr 14 '15 at 02:51
-
@Zizouz212 Highest value, as in the keys with the largest integer values. – Apr 14 '15 at 02:52
4 Answers
3
I think the most pythonic way is:
sorted(d, key=d.get, reverse=True)[:4]

wim
- 338,267
- 99
- 616
- 750
-
-
I voted this up as definitely better than mine, but I think I would make one amendment nonetheless. I dislike iterating over a `dict`'s keys by simply iterating over the `dict` -- I think that implicit behavior confuses folks all the time. I would prefer as more explicit `sorted(d.keys(), key=d.get ... )` – jwilner Apr 14 '15 at 01:06
-
Using `d.get()` will be less optimal than the `O(n)` solution. Why not just iterate items? The solution used on `collection.Counter.most_common` is `sorted(self.items(), key=itemgetter(1), reverse=True)` – JBernardo Apr 14 '15 at 01:17
-
@jwilner I kind of get what your saying. Can you show me what that amendment would look like? – Apr 14 '15 at 02:49
-
@Beth, it's just slightly different from the above solutions. `max_keys = sorted(d.keys(), key=d.get)[:-4]` or alternatively, you could reverse if you like -- `sorted(d.keys(), key=d.get, reverse=True)[:4]`. My larger point is that people forget that `list(dict) == d.keys()`, and in those kinds of situations I think programmers should defer to the more explicit option. – jwilner Apr 14 '15 at 02:58
-
1@jwilner The problem with that is that it is wasteful on python 2.x, unnecessarily copying a list in memory which will be garbage collected later. That the dicts iterate over the keys anyway should be common python knowledge, imo. – wim Apr 14 '15 at 03:14
-
@JBernardo What O(n) solution do you speak about? Aren't yours and mine both O(n log n) ? – wim Apr 14 '15 at 03:17
-
@wim I mean the iteration. On worst case each `d.get(x)` can happen on O(n). Most of the time it will be a single hop in the hash table though – JBernardo Apr 14 '15 at 14:57
-
@JBernardo d.get is O(1). It's only going to be O(n) if every hash is colliding - and you have to intentionally sabotage a dict to create that situation, it's not going to just happen by accident. – wim Apr 15 '15 at 00:09
-
Are you sure you know about time complexity and dict implementation? Because this depends heavily on the dictionary history of inserts and deletes. Also the "n" in O(n) for `d.get` is not the current number of elements but the size of the container which is at least 33% bigger to allow less hash collisions. Iterating directly is always more efficient even with no collisions. – JBernardo Apr 15 '15 at 02:23
1
You should be using a collections.Counter
object instead.
c = collections.Counter(original_dict)
c.most_common(4)
If you just want the keys, then:
[k for k, v in c.most_common(4)]
For reference on how it is implemented, check the source code here.

JBernardo
- 32,262
- 10
- 90
- 115
-
@Shashank you can use `Counter` to count anything. Try the code yourself. Also this is much cleaner – JBernardo Apr 14 '15 at 01:13
-
-
0
Sort by the values and then slice. The sorted
function takes a key function.
items_by_value = sorted(d.items(), key=lambda (k, v): v)
keys = [k for k, v in items_by_value[-4:]]

jwilner
- 6,348
- 6
- 35
- 47
0
Good question.
Assuming d
is the dictionary, and if you don't care what order the keys are in, you could use something like the following:
keys = [v[0] for v in sorted(d.items(), key=lambda v: v[1])[-4:]]
For huge dictionaries, it would be a more efficient to use:
keys = [v[0] for v in sorted(d.items(), key=operator.itemgetter(1))[-4:]]
The [-4:]
means the last four entries. In both cases, if you want the keys in the same order as their corresponding highest values, use [-1:-5:-1]
instead, which are the last four entries in reverse order.

martineau
- 119,623
- 25
- 170
- 301