List of Keys, how to find max values in Dictionary

Question

I have been working on an assignment that gathers data and counts how many times each item appears in a large dataset (about 500 MB). I have a couple of dictionaries that read CSV files and combine the data, and my final dict looks like the one below after all of the data has been gathered and processed.

I am almost done with the assignment but am stuck on this section: I need to find the top 5 values across all keys and values.

I have the following dictionary:

printed using: print key, task1[key]

KEY KEYVALUE

WA [[('1082225', 29), ('845195', 21), ('265021', 17)]]
DE [[('922397', 44), ('627084', 40), ('627297', 14)]]
DC [[('774648', 17), ('911624', 17), ('771241', 16)]]
WI [[('12618', 25), ('242582', 23), ('508727', 22)]]
WV [[('476050', 4), ('1016620', 3), ('769611', 3)]]
HI [[('466263', 5), ('226000', 5), ('13694', 4)]]

I need to go through and find the top 5 values and their ID numbers, for example:

DE 922397 44
DE 627084 40
WA 1082225 29

What would be the best way to do this?

EDIT: how I am putting together my task1 dictionary:

import operator

task1 = {}
for key, val in courses.items():
    task1[key] = [sorted(courses[key].iteritems(), key=operator.itemgetter(1), reverse=True)[:5]]

What does the actual dictionary look like? The thing you posted is not valid Python syntax. — Cory Kramer, Oct 14 '14 at 18:03
Relevant http://stackoverflow.com/questions/268272/getting-key-with-maximum-value-in-dictionary — Celeo, Oct 14 '14 at 18:03
@cyber that is what my python outputs when i print using: for key,val in courses.items(): print key, task1[key] — Andy P, Oct 14 '14 at 18:04
`task1[key] = [sorted(courses[key].iteritems(), key=operator.itemgetter(1), reverse=True)[:5]]` has an unnecessary pair of `[]` around it. This works just fine: `task1[key] = sorted(courses[key].iteritems(), key=operator.itemgetter(1), reverse=True)[:5]` — EML, Oct 14 '14 at 18:21

EML · Accepted Answer · 2014-10-14T18:38:02.567

Assuming your dict looks something like:

mydict = {'WA': [('1082225', 29), ('845195', 21), ('265021', 17)], 'DE': [('922397', 44), ('627084', 40), ('627297', 14)], ...}

This is not the ideal representation. You can flatten it into a more convenient list of tuples with a list comprehension:

data = [(k, idnum, v) for k, kvlist in mydict.items() for idnum, v in kvlist]

Now the data will look like:

[('WA', '1082225', 29), ('WA', '845195', 21), ('WA', '265021', 17), ('DE', '922397', 44), ...]

In this format the data is easy to read, and it is obvious what we need to search. This line sorts the tuples in descending order by their count (the element at index 2); slicing the result with [:5] keeps only the top 5:

sorted(data, key=lambda x: x[2], reverse=True)
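
Putting the steps together, here is a minimal sketch using the six sample rows from the question. The names top5 and top5_heap are only illustrative, and heapq.nlargest is shown as an alternative that avoids sorting the whole list; it is not something the original code uses. The snippet is written so it should run on both Python 2 and Python 3.

```python
import heapq

# Sample data copied from the question, without the extra pair of [].
mydict = {
    'WA': [('1082225', 29), ('845195', 21), ('265021', 17)],
    'DE': [('922397', 44), ('627084', 40), ('627297', 14)],
    'DC': [('774648', 17), ('911624', 17), ('771241', 16)],
    'WI': [('12618', 25), ('242582', 23), ('508727', 22)],
    'WV': [('476050', 4), ('1016620', 3), ('769611', 3)],
    'HI': [('466263', 5), ('226000', 5), ('13694', 4)],
}

# Flatten into (state, idnum, count) tuples.
data = [(k, idnum, v) for k, kvlist in mydict.items() for idnum, v in kvlist]

# Option 1: sort everything by count, descending, and keep the first 5.
top5 = sorted(data, key=lambda x: x[2], reverse=True)[:5]

# Option 2: ask heapq for the 5 largest by count without a full sort.
top5_heap = heapq.nlargest(5, data, key=lambda x: x[2])

for state, idnum, count in top5:
    print('%s %s %d' % (state, idnum, count))
# DE 922397 44
# DE 627084 40
# WA 1082225 29
# WI 12618 25
# WI 242582 23
```

For a dataset where only a handful of results are needed out of many rows, heapq.nlargest may be the cheaper choice; for a list this small either option is fine.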

Note: the dictionary you provided has an unnecessary extra pair of [] around each value, so I removed it in this answer for clarity.
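
If you would rather keep task1 exactly as it is built in the question, with each value wrapped in a one-element list, the same flattening still works once you unwrap with [0]. This is only a sketch; the loop variable name wrapped is made up for illustration, and only two states from the question are shown.

```python
# Values as printed in the question: each list is wrapped in an extra list.
task1 = {
    'WA': [[('1082225', 29), ('845195', 21), ('265021', 17)]],
    'DE': [[('922397', 44), ('627084', 40), ('627297', 14)]],
}

# wrapped[0] is the inner list of (idnum, count) pairs.
data = [(k, idnum, v)
        for k, wrapped in task1.items()
        for idnum, v in wrapped[0]]

# Same sort as before: descending by count, top 5 only.
top5 = sorted(data, key=lambda x: x[2], reverse=True)[:5]
```

Dropping the extra [] where task1 is built is still the simpler fix, since it removes the need for the [0] everywhere the data is read.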

Edited after clarification.

edited Oct 14 '14 at 18:38

answered Oct 14 '14 at 18:20

EML


It also fails if the max value is not in the first tuple of the list – smac89 Oct 14 '14 at 18:24
@Smac89 the list is being generated reverse sorted – cmd Oct 14 '14 at 18:25
After clarification, I modified the above answer. – EML Oct 14 '14 at 18:31
@cmd Oh I didn't see the updated post. So it seems OP is already reverse sorting each list – smac89 Oct 14 '14 at 18:31
