Quick note regarding similar questions: This question was initially flagged as similar to "How to emulate SQLs rank functions in R?" and "How to get ranks with no gaps when there are ties among values?". However, the first assumes the reader understands SQL's rank functions, which I and probably many other users do not, and the second addresses ranking tied elements, whereas my question addresses both ties and large gaps (>1) between the elements. I believe my question title is broader and describes the issue more clearly for users, so I ask that this question be left to stand.

The original question re: relative rankings of elements: I've been playing around with rank(), order(), seq(), list(), and unlist() functions in R in order to get the relative ranking of each element in a list in R. In the two examples illustrated below, I'm trying to derive the yellow columns showing the relative rank of each element. How can this be done? I have a preference for dplyr if it's easier to execute than in base R.

[Image: two example tables, each with the elements and a yellow column showing the desired relative rank]

When I run the following code for the illustrated Example 1, I get these results, which are not what I want:

Example1 <- data.frame(Element = c(1,1,1,2,1,3,1))
rank(Example1$Element)
# [1] 3 3 3 6 3 7 3
  • Note that it's debatable whether in ex1 2 and 3 are the 2nd and 3rd highest. In sports when there's a tie in 1st place, the next athlete or team is ranked in 3rd place, not the silver medal. `rank(Example1$Element, ties.method = "min")` accounts for that. – Rui Barradas Jul 15 '22 at 09:06
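
To make the difference concrete, here is a small sketch comparing the default rank(), the "sports" ranking with ties.method = "min", and the gap-free ranking the question asks for. It uses the Example1 values from the question; the variable names are only illustrative.

```r
x <- c(1, 1, 1, 2, 1, 3, 1)

# Default: tied values share the average of the positions they occupy
avg_rank <- rank(x)
# [1] 3 3 3 6 3 7 3

# "Sports" ranking: tied values share the lowest position, leaving gaps after ties
min_rank <- rank(x, ties.method = "min")
# [1] 1 1 1 6 1 7 1

# Gap-free ranking: position of each value among the sorted distinct values
dense <- match(x, sort(unique(x)))
# [1] 1 1 1 2 1 3 1
```

Only the last variant produces consecutive ranks, which is what the yellow columns in the question appear to show.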

3 Answers

In base R, a trick is to use as.numeric with as.factor:

as.numeric(as.factor(Example1$Element))
# [1] 1 1 1 2 1 3 1
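
As a quick sketch of why this works when the values have gaps: factor levels are the sorted distinct values, and the integer codes index into those levels, so the size of the gaps does not matter. The vector below is the same one used for Example 2 in the dense_rank() answer in this thread; the variable names are illustrative.

```r
x <- c(4, 1, 1, 1, 3, 5, 1)

f <- as.factor(x)
lv <- levels(f)
# [1] "1" "3" "4" "5"

# The integer code of each element is its position among the sorted levels
codes <- as.numeric(f)
# [1] 3 1 1 1 2 4 1
```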
Maël

You can also use match, i.e.

match(Example1$Element, sort(unique(Example1$Element)))
#[1] 1 1 1 2 1 3 1
Sotos
  • Note that you need a `sort()` around the `unique()` to get the correct result if the unique levels don't already appear in sorted order in the data. – Mikko Marttila Jul 15 '22 at 09:18
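
A minimal sketch of the point made in this comment, using a vector whose distinct values do not first appear in sorted order (the variable names are illustrative):

```r
x <- c(4, 1, 1, 1, 3, 5, 1)

# Without sort(): unique() keeps order of first appearance (4, 1, 3, 5),
# so the result reflects first appearance, not magnitude
unsorted <- match(x, unique(x))
# [1] 1 2 2 2 3 4 2

# With sort(): the distinct values are 1, 3, 4, 5, giving the intended ranks
sorted <- match(x, sort(unique(x)))
# [1] 3 1 1 1 2 4 1
```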

I think you are looking for dplyr::dense_rank():

# Example 1
dplyr::dense_rank(c(1, 1, 1, 3, 1, 4, 1))
#> [1] 1 1 1 2 1 3 1

# Example 2
dplyr::dense_rank(c(4, 1, 1, 1, 3, 5, 1))
#> [1] 3 1 1 1 2 4 1

# Example in code
dplyr::dense_rank(c(1, 1, 1, 2, 1, 3, 1))
#> [1] 1 1 1 2 1 3 1
Mikko Marttila
  • Some `dbplyr` backends also translate `dense_rank()` to SQL queries when working with database connections, making it slightly more widely applicable than a base R solution. – Mikko Marttila Jul 15 '22 at 09:28
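
Since the question mentions a preference for dplyr, here is one way dense_rank() might be applied to the question's data frame inside mutate(). This is a sketch that assumes dplyr is installed; the column names Rank and RankDesc are made up for illustration.

```r
library(dplyr)

Example1 <- data.frame(Element = c(1, 1, 1, 2, 1, 3, 1))

ranked <- Example1 %>%
  mutate(
    Rank     = dense_rank(Element),       # smallest value gets rank 1
    RankDesc = dense_rank(desc(Element))  # largest value gets rank 1
  )

ranked$Rank
# [1] 1 1 1 2 1 3 1
ranked$RankDesc
# [1] 3 3 3 2 3 1 3
```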