I am stuck on something that is probably straightforward. I want to get a "dense" ranking (as defined for the data.table::frank
function) on a column in a data frame, but not based on that column's own order: the order should be given by another column (val
in my example).
I managed to get a dense ranking with @Prasad Chalasani's solution, like this:
library(dplyr)
foo_df <- data.frame(id = c(4,1,1,3,3), val = letters[1:5])
foo_df %>% arrange(val) %>% mutate(id_fac = as.integer(factor(id)))
#> id val id_fac
#> 1 4 a 3
#> 2 1 b 1
#> 3 1 c 1
#> 4 3 d 2
#> 5 3 e 2
But I would like the factor levels to be ordered based on val. Desired output:
foo_desired <- foo_df %>% arrange(val) %>% mutate(id_fac = as.integer(factor(id, levels = c(4,1,3))))
foo_desired
#> id val id_fac
#> 1 4 a 1
#> 2 1 b 2
#> 3 1 c 2
#> 4 3 d 3
#> 5 3 e 3
- I tried data.table::frank.
- I tried both methods by @Prasad Chalasani.
- I tried setting the order of id using id[rank(val)] (and sort(val), and order(val)). Finally, I also tried to sort the levels using rank(val) etc., but this throws an error (Evaluation error: factor level [3] is duplicated.).

I know that one can specify the level order by hand; I used this to create the desired output. However, this does not scale, as my real data has many more rows and levels.
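To make the failing attempt concrete, here is a minimal base-R sketch on the toy data. The names lvls, err and id_fac are only illustrative, and the unique() step at the end is an assumption about one way to avoid the duplicated levels, not something checked on the real data.

```r
# Same toy data as above; it is already sorted by val.
foo_df <- data.frame(id = c(4, 1, 1, 3, 3), val = letters[1:5])

# Candidate levels: id reordered by val. Repeated ids stay repeated.
lvls <- foo_df$id[order(foo_df$val)]   # 4 1 1 3 3

# Passing repeated levels to factor() raises an error in current R versions;
# tryCatch() captures the message instead of stopping the script.
err <- tryCatch(
  {
    factor(foo_df$id, levels = lvls)
    NA_character_
  },
  error = function(e) conditionMessage(e)
)

# Assumption: keeping only the first occurrence of each id removes the clash
# while preserving the order in which ids first appear along val.
id_fac <- as.integer(factor(foo_df$id, levels = unique(lvls)))
```

On this toy data, id_fac matches the id_fac column of the desired output, but the sketch relies on every id occupying a contiguous block once the rows are sorted by val.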
I need this for convenience, to produce a table with a specific row order, not for computations.
Created on 2018-12-19 by the reprex package (v0.2.1)