How to sort a dictionary to output from only highest value?

Question

The text file contains something like this:

Matt Scored: 10
Jimmy Scored: 3
James Scored: 9
Jimmy Scored: 8
....

My code so far:

from collections import OrderedDict
#opens the class file in order to create a dictionary
dictionary = {}
#splits the data so the name is the key while the score is the value
f = open('ClassA.txt', 'r')
d = {}
for line in f:
    firstpart, secondpart = line.strip().split(':')
    dictionary[firstpart.strip()] = secondpart.strip()
    columns = line.split(": ")
    letters = columns[0]
    numbers = columns[1].strip()
    if d.get(letters):
        d[letters].append(numbers)
    else:
        d[letters] = list(numbers)
#sorts the dictionary into alphabetical order
sorted_dict = OrderedDict(
sorted((key, list(sorted(vals, reverse=True))) 
       for key, vals in d.items()))
print (sorted_dict)

This code already prints the names sorted alphabetically, each with its scores from highest to lowest. However, I now need to output the names ordered by score, with the highest score first and the lowest last. I tried using the max function, but it outputs only the name and not the score itself. I also want the output to show only each person's highest score, not all of their previous scores as my current code does.

Should be fairly easy using `itertools.groupby`. Let me work something up — Adam Smith, Jan 27 '15 at 21:46
http://stackoverflow.com/questions/268272/getting-key-with-maximum-value-in-dictionary — user3809875, Jan 27 '15 at 21:47

score 1 · Answer 1 · answered Jan 27 '15 at 21:52

I do not think you need a dictionary in this case. Just keep the scores as a list of tuples.

For example, to sort by name:

>>> sorted([('c', 10), ('b', 16), ('a', 5)], 
           key = lambda row: row[0])
[('a', 5), ('b', 16), ('c', 10)]

Or by score:

>>> sorted([('c', 10), ('b', 16), ('a', 5)], 
           key = lambda row: row[1])
[('a', 5), ('c', 10), ('b', 16)]

myaut

you should use `operator.itemgetter` in these cases, rather than writing your own anonymous function – Adam Smith Jan 27 '15 at 22:03
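The suggestion in the comment above could look roughly like this. It is a minimal sketch that reuses the answer's sample tuples; the names `scores`, `by_name`, and `by_score` are made up for illustration.

```python
from operator import itemgetter

# Same sample data as in the answer: (name, score) tuples.
scores = [('c', 10), ('b', 16), ('a', 5)]

# itemgetter(0) picks the name, itemgetter(1) picks the score.
by_name = sorted(scores, key=itemgetter(0))
# reverse=True puts the highest score first.
by_score = sorted(scores, key=itemgetter(1), reverse=True)

print(by_name)   # [('a', 5), ('b', 16), ('c', 10)]
print(by_score)  # [('b', 16), ('c', 10), ('a', 5)]
```

`itemgetter(n)` behaves like `lambda row: row[n]` here, so the results match the lambda version.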

score 0 · Answer 2 · answered Jan 27 '15 at 22:02

You can use itertools.groupby to separate out each key on its own. That big long dict comp is ugly, but it works essentially by sorting your input, grouping it by the part before the colon, then taking the biggest result and saving it with the group name.

import itertools, operator

text = """Matt Scored: 10
Jimmy Scored: 3
James Scored: 9
Jimmy Scored: 8"""

result_dict = {group:max(map(lambda s: int(s.split(":")[1]), vals)) for
               group,vals in itertools.groupby(sorted(text.splitlines()),
                                               lambda s: s.split(":")[0])}

sorted_dict = sorted(result_dict.items(), key=operator.itemgetter(1), reverse=True)
# result:
[('Matt Scored', 10), ('James Scored', 9), ('Jimmy Scored', 8)]

Unrolling the dict comp gives something like:

sorted_txt = sorted(text.splitlines())
groups = itertools.groupby(sorted_txt, lambda s: s.split(":")[0])
result_dict = {}
for group, values in groups:
    # group is the first half of the line
    # start below any real score (assumes scores are never negative)
    result_dict[group] = -1
    for value in values:
        #value is the whole line, so....
        value = value.split(":")[1]
        value = int(value)
        result_dict[group] = max(result_dict[group], value)
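The sorting step in this answer matters because `itertools.groupby` only merges adjacent items with equal keys. The sketch below illustrates that with a shortened version of the sample data; `lines` and the helper `name_of` are hypothetical names introduced for the example.

```python
import itertools

lines = ["Jimmy Scored: 3", "James Scored: 9", "Jimmy Scored: 8"]

def name_of(line):
    # Everything before the colon, e.g. "Jimmy Scored".
    return line.split(":")[0]

# Unsorted: the two Jimmy lines are not adjacent, so they form two groups.
unsorted_keys = [key for key, _ in itertools.groupby(lines, name_of)]

# Sorted: equal keys sit next to each other and collapse into one group.
sorted_keys = [key for key, _ in itertools.groupby(sorted(lines), name_of)]

print(unsorted_keys)  # ['Jimmy Scored', 'James Scored', 'Jimmy Scored']
print(sorted_keys)    # ['James Scored', 'Jimmy Scored']
```

If the sort were skipped, a dict built from the groups would keep only the last group seen for each repeated name, which could silently drop that name's real maximum.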

score 0 · Answer 3 · edited May 23 '17 at 12:12

I would use bisect.insort from the very beginning to have a sorted list whenever you insert a new score, then it's only a matter of reversing or slicing the list to get the desired output:

from bisect import insort
from io import StringIO

d = {}
f = '''Matt Scored: 10
Jimmy Scored: 3
James Scored: 9
Jimmy Scored: 8'''

for line in StringIO(f):
    line = line.strip().split(' Scored: ')
    name, score = line[0], int(line[1])
    if d.get(name):
        # whenever new score is inserted, it's sorted from low > high
        insort(d[name], score)
    else:
        d[name] = [score]

d

{'Matt': [10], 'Jimmy': [3, 8], 'James': [9]}

Then to get the desired output:

for k in sorted(d.keys()):
    # score from largest to smallest, sorted by names
    print('sorted name, high>low score  ', k, d[k][::-1])
    # highest score, sorted by name
    print('sorted name, highest score ', k, d[k][-1])

Results:

sorted name, high>low score   James [9]
sorted name, highest score  James 9
sorted name, high>low score   Jimmy [8, 3]
sorted name, highest score  Jimmy 8
sorted name, high>low score   Matt [10]
sorted name, highest score  Matt 10

As a side note: list[::-1] == reversed list, list[-1] == last element
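To see `insort` and the two slicing idioms in isolation, here is a small sketch with made-up scores (the list `scores` and its values are not from the question):

```python
from bisect import insort

scores = []
for score in [3, 10, 8]:
    # insort places each value so the list stays in ascending order.
    insort(scores, score)

print(scores)        # [3, 8, 10]
print(scores[::-1])  # [10, 8, 3]  (reversed copy, highest first)
print(scores[-1])    # 10          (last element, i.e. the highest)
```

Because the list stays sorted after every insert, the highest score is always available at index -1 without a separate sorting pass.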

score 0 · Answer 4 · answered Feb 10 '15 at 14:41

Your code can be simplified a bit using a defaultdict:

from collections import defaultdict
d = defaultdict(list)

Next, it's good practice to open files with a context manager (the with statement).

with open('ClassA.txt') as f:

Finally, when looping through the lines of f, you should use a single dictionary, not two. To make sorting by score easier, you'll want to store the score as an int.

    for line in f:
        name, score = line.split(':')
        d[name.strip()].append(int(score.strip()))

One side effect of this approach is that scores with multiple digits (e.g., Jimmy Scored: 10) keep their full value (10) when a new list is created. In the original version, list('10') produces ['1', '0'].
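The difference is easy to check directly. This is a small illustrative sketch; the variable names are invented for the example.

```python
from collections import defaultdict

# list() iterates over the string, so each character becomes an element.
as_chars = list('10')

# Appending the converted int to a defaultdict(list) keeps the value whole.
d = defaultdict(list)
d['Jimmy Scored'].append(int('10'))

print(as_chars)           # ['1', '0']
print(d['Jimmy Scored'])  # [10]
```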

You can then use sorted's key argument to sort by the values in d rather than its keys.

sorted(d, key=lambda x: max(d[x]))

Putting it all together we get

from collections import defaultdict
d = defaultdict(list)
with open('ClassA.txt') as f:
    for line in f:
        name, score = line.split(':')
        d[name.strip()].append(int(score.strip()))

# Original
print(sorted(d.items()))

# By score ascending
print(sorted(d.items(), key=lambda x: max(x[1])))

# By score descending
print(sorted(d.items(), key=lambda x: max(x[1]), reverse=True))
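If only the single highest score per name is needed, as the question asks, one possible variation keeps just the running maximum instead of a list. This is a sketch, not part of the code above: the `lines` list stands in for the contents of ClassA.txt, and `best` and `ranking` are hypothetical names.

```python
# Hypothetical stand-in for the lines of ClassA.txt.
lines = [
    "Matt Scored: 10",
    "Jimmy Scored: 3",
    "James Scored: 9",
    "Jimmy Scored: 8",
]

best = {}
for line in lines:
    name, score = line.split(':')
    name, score = name.strip(), int(score.strip())
    # Keep the larger of the stored score and the new one.
    best[name] = max(best.get(name, score), score)

# Highest score first.
ranking = sorted(best.items(), key=lambda item: item[1], reverse=True)
for name, score in ranking:
    print(name, score)
```

With the sample data, this prints Matt Scored 10, then James Scored 9, then Jimmy Scored 8.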
