Ranking functions return a ranking value for each row in a partition. Depending on the function that is used, some rows might receive the same value as other rows. Ranking functions are non-deterministic.
Questions tagged [ranking-functions]
193 questions
20
votes
3 answers
How to add row number column in SQL Server 2012
I'm trying to add a new column to an existing table, where the value is the row number/rank. I need a way to generate the row number/rank value, and I also need to limit the rows affected--in this case, the presence of a substring within a…

Michael Carper
- 331
- 2
- 4
- 10
17
votes
4 answers
DataFrame: add column whose values are the quantile number/rank of an existing column?
I have a DataFrame with some columns. I'd like to add a new column where each row value is the quantile rank of one existing column.
I can use DataFrame.rank to rank a column, but then I don't know how to get the quantile number of this ranked…

luca
- 7,178
- 7
- 41
- 55
15
votes
1 answer
Efficiently determine "how sorted" a list is, eg. Levenshtein distance
I'm doing some research on ranking algorithms, and would like to, given a sorted list and some permutation of that list, calculate some distance between the two permutations. For the case of the Levenshtein distance, this corresponds to calculating…

stefan
- 1,511
- 9
- 13
13
votes
3 answers
SQL over clause - dividing partition into numbered sub-partitions
I have a challenge, that I've come across at multiple occasions but never been able to find an efficient solution to. Imagine I have a large table with data regarding e.g. bank accounts and their possible revolving moves from debit to…

Stanley Gade
- 133
- 6
12
votes
1 answer
Calculating SQL Server ROW_NUMBER() OVER() for a derived table
In some other databases (e.g. DB2, or Oracle with ROWNUM), I can omit the ORDER BY clause in a ranking function's OVER() clause. For instance:
ROW_NUMBER() OVER()
This is particularly useful when used with ordered derived tables, such as:
SELECT…

Lukas Eder
- 211,314
- 129
- 689
- 1,509
8
votes
2 answers
Algorithm/formula to calculate Product ranking on a ecommerce website(Based on following criteria)
I have been Working on algorithms and formulas to find out a score for the products available on my ecommerce website.Basically, I want to calculate some kind of score to rank the products when a user searches it.
I'll give some background on the…
user4115825
7
votes
4 answers
Rank array values with potential duplicate values and skipping some positions if there is a tie
I am working with database data that manipulates college students exam results. Basically, I am pulling the records from a MySQL database and pulling one class at any given time. I want to rank the students with the highest performer given the rank…

Bululu
- 555
- 2
- 8
- 14
6
votes
1 answer
Python Pandas qcut behavior with # of observations not divisible by # of bins
Suppose I had a pandas series of dollar values and wanted to discretize into 9 groups using qcut. The # of observations is not divisible by 9. SQL Server's ntile function has a standard approach for this case: it makes the first n out of 9 groups…

AllenQ
- 1,659
- 2
- 16
- 18
6
votes
2 answers
Finding repeated occurrences with ranking functions
Please help me generate the following query i've been struggling with for some time now. Lets' say I have a simple table with month number and information whether there were any failed events in this particular month
Below a script to generate…

Tomasz Zadrożny
- 63
- 4
5
votes
1 answer
Windowing function in Hive
I am exploring windowing functions in Hive and I am able to understand the functionalities of all the UDFs. Although, I am not able to understand the partition by and order by that we use with the other functions. Following is the structure that is…

satish silveri
- 358
- 3
- 17
5
votes
1 answer
Is NDCG (normalized discounted gain) flawed? I have calculated a few alternative ranking quality measures, and I can't make heads or tails of it
I'm using python for a learning-to-rank problem, and I am evaluating my success using the following DCG and NDCG code (from http://nbviewer.ipython.org/github/ogrisel/notebooks/blob/master/Learning%20to%20Rank.ipynb )
def dcg(relevances, rank=20):
…

neelshiv
- 6,125
- 6
- 21
- 35
5
votes
1 answer
What is the difference between point-wise and pair-wise ranking in machine learning
I've been reading papers on pairwise ranking and this is what I don't get:
what is the difference in the training/testing data between pointwise and pairwise ranking?
This is the paper that I have been…

Martin Klosi
- 3,098
- 4
- 32
- 39
4
votes
4 answers
Scalable time decay for web application
My goal here is to generate a system similar to that of the front page of reddit.
I have things and for the sake of simplicity these things have votes. The best system I've generated is using time decay. With a halflife of 7 days, if a vote is…

Piper Merriam
- 2,774
- 2
- 24
- 30
4
votes
1 answer
postgresql Select all rows which have max value in one column
I'm new to postgresql, I have a table T like this in postgresql:
C1 C2 C3 C4 ID
C1,C2, C4 are int.
C3 is a char
C4 is essentially the version number. and can be anything between 1-N.
( For example, after one set of insert events the values in table…

DJ_Stuffy_K
- 615
- 2
- 11
- 29
4
votes
1 answer
Variable rank that doesn't skip places in R
I have a vector that looks something like this:
> vec
[1] 1 1 2 5 10
I'm trying to convert this to a form of non-olympic ranking where ties are still ties, but the ranks are guaranteed 1 apart, even when there are multiple ties above it, so:
>…

Mittenchops
- 18,633
- 33
- 128
- 246