I have a dataframe that looks like the following:

gene_id         Peak         Gene_Symbol TPM  Liver_Rank Heart_Rank
ENSG00000000003 Peak_11824      TSPAN6  34.51   2508    2768
ENSG00000000003 Peak_144083     TSPAN6  34.51   2508    2768
ENSG00000000005 Peak_174044     TNMD    0.42    43      NA
ENSG00000000419 Peak_7341       DPM1    19.97   1844    1484
ENSG00000000457 Peak_179030     SCYL3   1.52    153     2775
ENSG00000000457 Peak_179030     SCYL3   1.52    153     2775
ENSG00000000457 Peak_179030     SCYL3   1.52    153     2775
ENSG00000000457 Peak_176186     SCYL3   1.52    153     2775
ENSG00000000457 Peak_176186     SCYL3   1.52    153     2775

I want to add a new Rank column that takes both Heart_Rank and Liver_Rank into account, so that the closer Heart_Rank and Liver_Rank are to 1, the closer the value in the Rank column is to 1. The Rank column should run consecutively from 1 upward, with tied rows sharing a rank and no gaps after ties. How can I achieve this?

I'm aware of how to add a rank column, for instance via:

df$Liver_Rank <- match(df$TPM, sort(unique(df$TPM), decreasing = FALSE))

But I'm not sure how I would incorporate two columns in the method above. Thanks!

claudiadast
  • The method in the duplicate question does not do the ranking in consecutive order; it jumps from `1` to `16` in the `Rank` column when I implement that method. – claudiadast Jun 28 '19 at 17:57
  • OK! Use `ties.method = "dense"` "which returns the ranks without any gaps in the ranking": `d[ , r := frank(d, Liver_Rank, Heart_Rank, ties.method = "dense")]` – Henrik Jun 28 '19 at 18:09
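
For anyone landing here, the suggestion in the comment above can be sketched on the sample rows from the question. This is only a minimal sketch and makes two assumptions: that the data.table package is installed, and that "taking both ranks into account" means ordering by Liver_Rank first and breaking ties with Heart_Rank (the order the columns are passed to `frank`). The column name `Rank` and the object name `d` are just illustrative choices. Rows with `NA` in a rank column are placed last among their ties by `frank`'s default `na.last = TRUE`.

```r
library(data.table)

# Sample rows copied from the question
d <- data.table(
  gene_id     = c("ENSG00000000003", "ENSG00000000003", "ENSG00000000005",
                  "ENSG00000000419", rep("ENSG00000000457", 5)),
  Peak        = c("Peak_11824", "Peak_144083", "Peak_174044", "Peak_7341",
                  "Peak_179030", "Peak_179030", "Peak_179030",
                  "Peak_176186", "Peak_176186"),
  Gene_Symbol = c("TSPAN6", "TSPAN6", "TNMD", "DPM1", rep("SCYL3", 5)),
  TPM         = c(34.51, 34.51, 0.42, 19.97, rep(1.52, 5)),
  Liver_Rank  = c(2508L, 2508L, 43L, 1844L, rep(153L, 5)),
  Heart_Rank  = c(2768L, 2768L, NA_integer_, 1484L, rep(2775L, 5))
)

# Dense rank over both columns: ordered by Liver_Rank, then Heart_Rank.
# Tied rows share a rank and the next distinct pair gets the next integer.
d[, Rank := frank(d, Liver_Rank, Heart_Rank, ties.method = "dense")]

print(d[, .(Gene_Symbol, Liver_Rank, Heart_Rank, Rank)])
```

Note that a dense rank runs from 1 to the number of distinct (Liver_Rank, Heart_Rank) pairs, not to the number of rows, since duplicated rows share a rank. If both columns should carry equal weight instead, one alternative would be to rank on a combined score such as their sum, but that is a different definition from the one sketched here.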
