2

I have a python dictionary named marks:

marks={’joe’:4, ’suzan’:5, ’peter’:4, ’bob’:4, ’john’:6, ’ann’:5, ’fred’:3}

I want to produce a list of students according to their rank at that test. More precisely, build a list rank such that:

  1. rank[0] contains the list of best students,
  2. rank[1] contains the list of second best students, etc.

The output should be:

[[’john’], [’ann’, ’suzan’], [’joe’, ’bob’, ’peter’], [’fred’]]

Someone can help me? I'm trying for a while but I do not find any easy solutions.. Thanks

5 Answers5

2

Python has some great builtin libraries like itertools and operator.

heres a one liner:

from itertools import groupby
from operator import itemgetter

marks = {'ann': 5, 'bob': 4, 'fred': 3, 'joe': 4, 'john': 6, 'peter': 4, 'suzan': 5}

print [[name for name, rank in g] for k, g in groupby(sorted(marks.iteritems(), key=itemgetter(1), reverse=True), key=itemgetter(1))]

I based my implementation heavily on the example here.

Doody P
  • 4,611
  • 2
  • 20
  • 18
  • I think this is the best approach. One possible alternative would be to use `marks.get` as the key function and sort just `marks` (the keys of the dictionary) rather than the items. This would make forming the inner lists of names much simpler, since you can just call `list`: `[list(g) for k, g in groupby(sorted(marks, key=marks.get, reverse=True), key=marks.get)]`. This might be slightly slower though. – Blckknght Jan 13 '15 at 19:07
  • It's slower to make extra generators you don't need and then do list comprehension on them. – user3467349 Jan 13 '15 at 19:18
1

Generic algorithm (in pseudocode) would look like this:

temp = {}
for student, rank in marks:
    if not temp has key rank:
        temp[rank] = []
    push student to temp[rank]
keys = sort keys of temp
result = []
for key in keys:
    push temp[key] to result
return result

UPDATE:

Python implementation:

marks={'joe':4, 'suzan':5, 'peter':4, 'bob':4, 'john':6, 'ann':5, 'fred':3}

temp = {}

for student in marks:
    mark = marks[student]
    if not temp.has_key(mark):
        temp[mark] = []
    temp[mark].append(student)

ranks = sorted([v for v in temp])
ranks.reverse()

result = []

for rank in ranks:
    result.append(temp[rank])

print result
iwlagn
  • 477
  • 3
  • 10
1

A more readable alternative using a defaultdict would be

marks = {'ann': 5, 'bob': 4, 'fred': 3, 'joe': 4, 'john': 6, 'peter': 4, 'suzan': 5}

from collections import defaultdict
rankings = defaultdict(list)
for name,score in marks.iteritems():
    rankings[score].append(name)

You know have that rankings[6] is a list of all the best, i.e. in this case rankings[6]=[john].

To recover the list of rankings from this dict you can sort the items by key:

result = []
for key in sorted(rankings,reverse=True):
    result.append(rankings[key])
gg349
  • 21,996
  • 5
  • 54
  • 64
  • Note that an `OrderedDict` returns its keys in the order they were first added, not in sorted order. That means it's not terribly useful to you here. While data structures that sort their keys automatically do exist (i.e. various kinds of sort trees), I don't think there are any built into Python's standard library (except perhaps as part of database packages). – Blckknght Jan 13 '15 at 19:33
  • @Blckknght, absolutely good point, my mistake. Edited accordingly. – gg349 Jan 13 '15 at 21:06
1

One Line

marks = {'ann': 5, 'bob': 4, 'fred': 3, 'joe': 4, 'john': 6, 'peter': 4, 'suzan': 5}

[[name for name, val in marks.items() if val == score]\
 for score in sorted(set(marks.values()))]

edit: changed names for clarity.

user3467349
  • 3,043
  • 4
  • 34
  • 61
  • There's no need for the lambda in this expression: `[[name for name, val in marks.items() if val == score] for score in set(marks.values)]` does just as well. It is however worth noting that it may not present the scores in any particular order. It will often be increasing order (and probably works for the example data), since integers hash to themselves, but for larger, more widely distributed scores it wont work the same way. – Blckknght Jan 13 '15 at 19:02
  • Thanks, added sorted, removed lambda. – user3467349 Jan 13 '15 at 19:43
0

I of course don't know your use case, but it seems like it would be better to store the information in a new "inverted dict" instead of creating a list of separate lists of students without any score information attached to them:

marks_inverted = {v: [] for v in marks.values()}
for k,v in marks.items():
    marks_inverted[v].append(k)
print(marks_inverted)
#OUTPUT: {6: ['john'], 5: ['suzan', 'ann'], 4: ['bob', 'joe', 'peter'], 3: ['fred']}
#Note that as a dict, marks_inverted doesn't have any order

Now you can look up the list of students who got each score:

print(marks_inverted[6])
#OUTPUT: ['john']

Or the maximum score:

print(marks_inverted[max(marks_inverted)]
#OUTPUT: ['john']

If you use a collections.defaultdict instead of a regular dict, you won't have to worry about getting an error if you ask for a score nobody got, and you don't have to initialize the empty lists first (idea for this came from here):

from collections import defaultdict as dd
marks_inverted = dd(list)
for k,v in marks.items():
    marks_inverted[v].append(k) 
    #if marks_inverted[v] doesn't exist it is initialized as [] before it is appended

Now if you look up 2, you won't get an error even though nobody got a 2:

print(marks_inverted[2])
#OUTPUT: []

If you're using a collections.defaultdict and you want to get the students who performed over a range of scores, you can do it like this:

[marks_inverted[k] for k in range(2,5) if marks_inverted[k]]
#OUTPUT: [['fred'], ['bob', 'joe', 'peter']]

...or if you want the list flattened:

[name for index in [marks_inverted[k] for k in range(2,5) if marks_inverted[k]] for name in index]
#OUTPUT: ['fred', 'bob', 'joe', 'peter']

(The if marks_inverted[k] part of the statement makes it so no empty lists are included as members of the list output.)

If at any time you really do need to get rid of the score information and just have a list of students in order of highest to lowest marks (like in the original question), you can just do:

[marks_inverted[k] for k in reversed(sorted(marks_inverted)) if marks_inverted[k]]
#OUTPUT: [['john'], ['ann', 'suzan'], ['bob', 'joe', 'peter'], ['fred']]
Community
  • 1
  • 1
Rick
  • 43,029
  • 15
  • 76
  • 119
  • I would say if he needs to access the information more than once a sorted list (of tuples) would be better than a dict - then he can bisect particular score ranges (i.e. 60-70) instead of being stuck with dict keys like 69.5 or 72.3. – user3467349 Jan 13 '15 at 16:07
  • 1
    It doesn't need to divorce the name - just do sorted(marks.items(), key = lambda x: x[1]) – user3467349 Jan 13 '15 at 16:16
  • That's a nice way of doing it. My thinking on using the "inverted dict" idea is that it looks like there are only 6 or 7 scores, and no in-betweens or decimals. If the scores ranged 0 to 100 with decimals, your idea would definitely be the way to go. – Rick Jan 13 '15 at 16:48
  • Edited to include a way to get range bisections (again assuming that all scores are integers). – Rick Jan 13 '15 at 18:36