There are several posts about returning the column name of the largest value in each row of a data frame (for example: "For each row return the column name of the largest value").

However, my problem is a bit more complicated. What code should I use in R to return, for each row, the column names of the two largest values (or three, or even ten)? To make this clearer, here is some example code:

DF <- data.frame(V1=c(2,8,1),V2=c(7,3,5),V3=c(9,6,4))

Printing DF gives:

  V1 V2 V3
1  2  7  9
2  8  3  6
3  1  5  4

For each row, I want the names of the columns holding the two largest values, ordered from largest to second largest. In this case, the result should be something like:

1  V3  V2 
2  V1  V3 
3  V2  V3 

Thanks very much for your help in advance! :)

Jingjun

2 Answers

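Using pmap from purrr. Each row's values are passed as named arguments, so c(...) rebuilds the row as a named vector; order(-tmp) then sorts the names from largest to smallest value, and head(..., 2) keeps the top two: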

library(purrr)
pmap(DF, ~ {tmp <- c(...); head(names(tmp)[order(-tmp)], 2)})

-output

[[1]]
[1] "V3" "V2"

[[2]]
[1] "V1" "V3"

[[3]]
[1] "V2" "V3"

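Or with dapply from collapse, which applies the function row-wise (MARGIN = 1); slt then selects the first two columns of the result: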

library(collapse)
slt(dapply(DF, MARGIN = 1, FUN = function(x) colnames(DF)[order(-x)]), 1:2)

-output

  V1 V2
1 V3 V2
2 V1 V3
3 V2 V3
akrun
DF <- data.frame(V1=c(2,8,1),V2=c(7,3,5),V3=c(9,6,4))
DF
#>   V1 V2 V3
#> 1  2  7  9
#> 2  8  3  6
#> 3  1  5  4

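# order() returns exactly one index per row, even when values are tied;
# which(x == sort(x)[k]) can return several indices and break the subsetting.
largest <- colnames(DF)[apply(DF, 1, FUN = function(x) order(x, decreasing = TRUE)[1])]
secondlargest <- colnames(DF)[apply(DF, 1, FUN = function(x) order(x, decreasing = TRUE)[2])]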

cbind(largest, secondlargest)
#>      largest secondlargest
#> [1,] "V3"    "V2"         
#> [2,] "V1"    "V3"         
#> [3,] "V2"    "V3"
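
Since the question also asks about the largest three or ten, the same idea can be wrapped in a function that takes n. This is only a minimal base R sketch: top_n_cols and the top1, top2, ... column labels are hypothetical names made up here, not part of any package. It assumes all columns are numeric and picks positions with order(), so tied values resolve to the leftmost column.

```r
# Hypothetical helper: names of the n largest columns in each row.
top_n_cols <- function(df, n = 2) {
  m <- as.matrix(df)
  stopifnot(is.numeric(m), n >= 1, n <= ncol(m))
  # For each row, positions of the n largest values, largest first.
  idx <- lapply(seq_len(nrow(m)), function(i) {
    order(m[i, ], decreasing = TRUE)[seq_len(n)]
  })
  # Map positions to column names; one output row per input row.
  matrix(colnames(m)[unlist(idx)], nrow = nrow(m), ncol = n, byrow = TRUE,
         dimnames = list(rownames(df), paste0("top", seq_len(n))))
}

DF <- data.frame(V1 = c(2, 8, 1), V2 = c(7, 3, 5), V3 = c(9, 6, 4))
top_n_cols(DF, 2)
#>   top1 top2
#> 1 "V3" "V2"
#> 2 "V1" "V3"
#> 3 "V2" "V3"
```

Calling top_n_cols(DF, 3) returns all three column names per row in decreasing order of value, and the result stays a matrix even for n = 1.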
Skaqqs