2

I have a dictionary:

d = {'a':[1,3], 'b':[3,4,5,6], 'c':[1], 'd':[1,2,3] }

I want to make a smaller, new dictionary with the top two key:value pairs, sorted by the len of the lists in value. So in this case, I want:

newd = {'b':[3,4,5,6], 'd':[1,2,3] }

I tried this answer but got this error:

NameError: global name 'd' is not defined
Community
  • 1
  • 1
mchangun
  • 9,814
  • 18
  • 71
  • 101
  • Please post exactly how you tried to implement the other solution. It looks *exactly* like what you wanted, so you must have made a mistake when you adapted the code. When I use the other code it works perfectly fine for me. – Felix Kling Dec 09 '13 at 06:21
  • @FelixKling I didn't adapt the code. I just copy and pasted from the answer. – mchangun Dec 09 '13 at 06:26
  • 2
    Well, I did as well and it works for me. See http://codepad.org/RyobPnHY. Did you copy the `>>>` as well by any chance? As it is, this question is just a duplicate. We cannot help you to learn to copy&paste code properly. – Felix Kling Dec 09 '13 at 06:26
  • "top two key:value pairs, sorted by the `len` of the lists in value" - you know dicts have no order, right? If you just want the key-value pairs with the top two longest values, you're fine, but if you want the dict to be sorted by value length, you're going to have problems. – user2357112 Dec 09 '13 at 07:36
  • @user2357112 I don't care about the order in the new dictionary, just that they contain the top n key:values pairs according to length of value. – mchangun Dec 09 '13 at 08:11

2 Answers2

4

One approach is to sort the items according to length of value, and then create a dictionary from the last two items.

sorted_items = sorted(d.items(), key = lambda item : len(item[1]))
newd = dict(sorted_items[-2:])
oadams
  • 3,019
  • 6
  • 30
  • 53
  • 1
    I'd argue that passing `reverse=true` to `sorted` is more performant than reversing the whole list afterwards. It's probably also helpful if you explain what `[::-1]` means exactly. – Felix Kling Dec 09 '13 at 06:25
  • Or slice off the last two `dict(sorted_items[-2:])` – John La Rooy Dec 09 '13 at 06:27
  • @Felix King: It probably is. Adjusted the code to just slice the last two anyway. – oadams Dec 09 '13 at 06:28
  • @gnibbler, I was making that edit similar to what you suggested, but reversal is irrelevant if you just slice the last two. – oadams Dec 09 '13 at 06:29
3

Seems like a job for heapq:

big_items = heapq.nlargest(2, d.items(), key=lambda x: len(x[1]))
newd = dict(big_items)

The advantage of heapq over sorted is that this provides an O(N) time complexity whereas sorted will yield an O(NlogN) time complexity. For small dicts, this probably doesn't matter much. sorted may even be faster due to a more optimized implementation, but for big dicts, this might actually buy you a significant speedup.


(on python2.x, you can use d.iteritems() instead of d.items())

mgilson
  • 300,191
  • 65
  • 633
  • 696