Let's say I have a list:

[4, 5, 2, 1]

I need to rank these and have the output as:

[3, 4, 2, 1]

If two items have the same value, as in:

[4, 4, 2, 3]

then their rankings should be averaged: [3.5, 3.5, 1, 2]

EDIT
Here, rank stands for the position of a number in the sorted list. If several numbers have the same value, the rank of each of them is the average of their positions.
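
To make the definition concrete, here is a minimal brute-force sketch. The name average_ranks is made up for illustration, and the approach is quadratic, so it is meant as a reference for checking results rather than as an efficient solution.

```python
def average_ranks(values):
    """Rank each value by its 1-based position in sorted order, averaging ties."""
    ranks = []
    for v in values:
        smaller = sum(1 for other in values if other < v)
        equal = sum(1 for other in values if other == v)  # includes v itself
        # v occupies sorted positions smaller + 1 .. smaller + equal;
        # the mean of that run of positions is smaller + (equal + 1) / 2.
        ranks.append(smaller + (equal + 1) / 2.0)
    return ranks


print(average_ranks([4, 5, 2, 1]))  # [3.0, 4.0, 2.0, 1.0]
print(average_ranks([4, 4, 2, 3]))  # [3.5, 3.5, 1.0, 2.0]
```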

niyasc
Steve

2 Answers

2

Probably not the most efficient, but this works.

  • rank takes an item and a sorted list, and works out the item's rank by finding where the item would be inserted to go before all elements equal to it (bisect_left) and where it would be inserted to go after them (bisect_right), then averaging the corresponding 1-based positions.
  • rank_list uses rank to compute the ranks of all elements. The partial call just binds the sorted list once, so the list does not have to be sorted again for each item lookup.

Like so:

from bisect import bisect_left, bisect_right
from functools import partial

def rank(item, lst):
    '''return rank of item in sorted list'''
    return (1 + bisect_left(lst, item) + bisect_right(lst, item)) / 2.0

def rank_list(lst):
    f = partial(rank, lst=sorted(lst))
    return [f(i) for i in lst]

rank_list([4, 4, 2, 1])
## [3.5, 3.5, 2.0, 1.0]
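
As a rough illustration of the arithmetic inside rank, the sketch below prints the two bisection results for the same example list. The helper name explain is made up for this illustration.

```python
from bisect import bisect_left, bisect_right

sorted_lst = sorted([4, 4, 2, 1])  # [1, 2, 4, 4]


def explain(item):
    """Return (bisect_left, bisect_right, average rank) for an item in sorted_lst."""
    lo = bisect_left(sorted_lst, item)   # 0-based index of the first occurrence
    hi = bisect_right(sorted_lst, item)  # 0-based index just past the last occurrence
    # The 1-based positions run from lo + 1 to hi, so their mean is (lo + 1 + hi) / 2.
    return lo, hi, (1 + lo + hi) / 2.0


print(explain(4))  # (2, 4, 3.5)
print(explain(2))  # (1, 2, 2.0)
print(explain(1))  # (0, 1, 1.0)
```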
tzaman
  • I timed the two solutions here with a list of 3 items: The solution I posted below took 59% longer to run – Steve Mar 27 '15 at 08:49
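
A list of 3 items is too small to say much about how the two approaches scale. A rough harness along the following lines could repeat the comparison on larger inputs. The names rank_list_bisect, rankdata_loop and compare are made up for this sketch, both functions are local re-implementations of the two answers, and no timings are quoted because they depend on the machine and the Python version.

```python
import random
import timeit
from bisect import bisect_left, bisect_right


def rank_list_bisect(lst):
    """Average ranks via two bisections per item on a sorted copy."""
    s = sorted(lst)
    return [(1 + bisect_left(s, x) + bisect_right(s, x)) / 2.0 for x in lst]


def rankdata_loop(a):
    """Average ranks via one pass over the sorted order, grouping ties."""
    n = len(a)
    ivec = sorted(range(n), key=a.__getitem__)
    svec = [a[i] for i in ivec]
    sumranks = dupcount = 0
    ranks = [0] * n
    for i in range(n):
        sumranks += i
        dupcount += 1
        if i == n - 1 or svec[i] != svec[i + 1]:
            averank = sumranks / float(dupcount) + 1
            for j in range(i - dupcount + 1, i + 1):
                ranks[ivec[j]] = averank
            sumranks = dupcount = 0
    return ranks


def compare(n_items=10000, repeats=20, seed=0):
    """Time both functions on the same random list that contains many ties."""
    rng = random.Random(seed)
    data = [rng.randrange(max(1, n_items // 10)) for _ in range(n_items)]
    # Sanity check: both approaches must agree before timing means anything.
    assert rank_list_bisect(data) == rankdata_loop(data)
    return {
        "bisect": timeit.timeit(lambda: rank_list_bisect(data), number=repeats),
        "loop": timeit.timeit(lambda: rankdata_loop(data), number=repeats),
    }


if __name__ == "__main__":
    print(compare())  # dict of total seconds per approach
```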
0

I found an answer to this here: Efficient method to calculate the rank vector of a list in Python

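def rank_simple(vector):
    # Indices of vector's elements in ascending order of value.
    return sorted(range(len(vector)), key=vector.__getitem__)

def rankdata(a):
    n = len(a)
    ivec = rank_simple(a)
    svec = [a[rank] for rank in ivec]  # values in sorted order
    sumranks = 0  # sum of 0-based sorted positions in the current run of ties
    dupcount = 0  # length of the current run of ties
    newarray = [0] * n
    for i in range(n):  # range instead of xrange, so this also runs on Python 3
        sumranks += i
        dupcount += 1
        # A run ends at the last element or when the next value differs.
        if i == n - 1 or svec[i] != svec[i + 1]:
            averank = sumranks / float(dupcount) + 1  # +1 makes ranks 1-based
            for j in range(i - dupcount + 1, i + 1):
                newarray[ivec[j]] = averank
            sumranks = 0
            dupcount = 0
    return newarray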

I would like to see if there are any simpler or more efficient ways of doing this.
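
One possibly simpler variant, offered as a sketch rather than a benchmarked improvement, is to let itertools.groupby find the runs of tied values in the sorted order. The function name rank_groupby is made up for this illustration.

```python
from itertools import groupby


def rank_groupby(values):
    """Average ranks: sort the indices once, then rank each run of equal values."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    placed = 0  # number of items already ranked
    for _, group in groupby(order, key=values.__getitem__):
        idxs = list(group)
        # This run occupies 1-based positions placed + 1 .. placed + len(idxs).
        avg = placed + (len(idxs) + 1) / 2.0
        for i in idxs:
            ranks[i] = avg
        placed += len(idxs)
    return ranks


print(rank_groupby([4, 5, 2, 1]))  # [3.0, 4.0, 2.0, 1.0]
print(rank_groupby([4, 4, 2, 3]))  # [3.5, 3.5, 1.0, 2.0]
```

If SciPy is available, scipy.stats.rankdata also averages tied ranks by default, which may be the simplest option when an extra dependency is acceptable.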

Community
Steve