How to take two columns into consideration when creating a rank column in R?

Question

I have a dataframe that looks like the following:

gene_id          Peak         Gene_Symbol  TPM    Liver_Rank  Heart_Rank
ENSG00000000003  Peak_11824   TSPAN6       34.51  2508        2768
ENSG00000000003  Peak_144083  TSPAN6       34.51  2508        2768
ENSG00000000005  Peak_174044  TNMD         0.42   43          NA
ENSG00000000419  Peak_7341    DPM1         19.97  1844        1484
ENSG00000000457  Peak_179030  SCYL3        1.52   153         2775
ENSG00000000457  Peak_179030  SCYL3        1.52   153         2775
ENSG00000000457  Peak_179030  SCYL3        1.52   153         2775
ENSG00000000457  Peak_176186  SCYL3        1.52   153         2775
ENSG00000000457  Peak_176186  SCYL3        1.52   153         2775

I want to add a new Rank column that takes both Heart_Rank and Liver_Rank into account, so that the closer Heart_Rank and Liver_Rank are to 1, the closer the value in the Rank column is to 1. The Rank column should run consecutively from 1 upward, with tied rows sharing a rank and no gaps after ties. How can I achieve this?

I'm aware of how to add a rank column, for instance via:

df$Liver_Rank <- match(df$TPM, sort(unique(df$TPM), decreasing = FALSE))

But I'm not sure how I would incorporate two columns in the method above. Thanks!

The method in the duplicate question does not do the ranking in consecutive order; it jumps from `1` to `16` in the `Rank` column when I implement that method. — claudiadast, Jun 28 '19 at 17:57
OK! Use `ties.method = "dense"` "which returns the ranks without any gaps in the ranking": `d[ , r := frank(d, Liver_Rank, Heart_Rank, ties.method = "dense")]` — Henrik, Jun 28 '19 at 18:09
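
For readers without data.table, here is a minimal base R sketch of the same dense-ranking idea, extending the `match`/`sort`/`unique` approach from the question to pairs of values. It assumes that "taking both columns into account" means ordering by Liver_Rank first and using Heart_Rank only to break ties, which is one possible interpretation rather than the only one. The toy data frame and the helper `key()` are made up for illustration.

```r
# Toy data modelled on the table in the question (a subset of its rows).
df <- data.frame(
  Gene_Symbol = c("TSPAN6", "TSPAN6", "TNMD", "DPM1", "SCYL3", "SCYL3"),
  Liver_Rank  = c(2508, 2508, 43, 1844, 153, 153),
  Heart_Rank  = c(2768, 2768, NA, 1484, 2775, 2775)
)

# Distinct (Liver_Rank, Heart_Rank) pairs, sorted by Liver_Rank, then Heart_Rank.
# order() places NA values last within each sort key by default.
pairs <- unique(df[, c("Liver_Rank", "Heart_Rank")])
pairs <- pairs[order(pairs$Liver_Rank, pairs$Heart_Rank), ]

# Hypothetical helper: build one string key per row so pairs can be matched.
key <- function(d) paste(d$Liver_Rank, d$Heart_Rank, sep = "_")

# Dense rank = position of each row's pair among the sorted distinct pairs,
# so tied rows share a rank and the next distinct pair gets the next integer.
df$Rank <- match(key(df), key(pairs))

print(df)
```

With this toy data the Rank column comes out as 4, 4, 1, 3, 2, 2. Note that a dense rank runs from 1 to the number of distinct pairs, not to the number of rows, since tied rows share a value.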
