Let's say I have a data frame like the one below and want to find the highest value in each row, then put the name of the column that holds it (not the value itself) into a new vector. How could I go about this?

df <- data.frame(matrix(rnorm(50, 20), 5))

        X1       X2       X3       X4       X5       X6       X7       X8       X9      X10
1 18.49755 18.98823 18.53194 18.86478 20.74333 18.04460 21.08717 21.75072 18.05813 19.08402
2 20.44626 20.07205 19.36755 17.14943 18.58396 20.76463 20.23776 18.90171 18.99182 20.51338
3 20.27142 18.74448 21.42953 20.13568 20.40065 22.26788 19.30967 20.51772 19.20067 19.75371
4 20.61600 21.27852 18.54137 20.84269 20.27767 20.70583 21.33051 20.03136 20.60405 21.24672
5 19.64165 21.20197 20.06732 19.59529 20.48761 19.83571 19.80155 21.02669 20.77574 21.21862

I have tried

results <- apply(df, 1, max)

which gives me the highest value in each row, but I want the result vector to hold the name of the column associated with that value, not the value itself.

So, instead of a vector of the 5 highest values by row, I would have a vector of the column names that "won", such as:

result <- c("X1", "X3", "X2", "X1", "X9")

Thank you.

Seanosapien

3 Answers

Use which.max:

names(df)[apply(df, 1, which.max)]
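
For context, which.max() returns the position of the first maximum in a vector, so applying it by row gives one column index per row, and that index then selects from names(df). A minimal sketch of the same idea, spelled out step by step, with base R's max.col() as a possible vectorised alternative (not part of the original answer). The seed value is an arbitrary assumption, used only for reproducibility:

```r
set.seed(42)  # arbitrary seed (an assumption), only so the example is reproducible
df <- data.frame(matrix(rnorm(50, 20), 5))

# Column index of the first maximum in each row
idx <- apply(df, 1, which.max)

# Map the indices back to column names
winners <- names(df)[idx]

# Alternative: max.col() processes the whole matrix at once.
# ties.method = "first" picks the first column among tied maxima;
# note that max.col() treats entries within a small relative tolerance
# of the row maximum as ties, so it can differ from which.max()
# when two values are nearly equal.
winners_alt <- names(df)[max.col(df, ties.method = "first")]
```

Both winners and winners_alt are character vectors with one column name per row.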
Abdou

You can add a step inside the function passed to apply() so that it returns the column name associated with the row maximum.

Note that I've used set.seed() so that the random sample is reproducible.

set.seed(123)

df <- data.frame(matrix(rnorm(50, 20), 5))

apply(df, 1, function(x) {  
    names(x)[x == max(x)]  
})

# [1] "X4" "X6" "X1" "X9" "X6"

df

#         X1       X2       X3       X4       X5       X6       X7       X8       X9      X10
# 1 19.43952 21.71506 21.22408 21.78691 18.93218 18.31331 20.42646 20.68864 19.30529 18.87689
# 2 19.76982 20.46092 20.35981 20.49785 19.78203 20.83779 19.70493 20.55392 19.79208 19.59712
# 3 21.55871 18.73494 20.40077 18.03338 18.97400 20.15337 20.89513 19.93809 18.73460 19.53334
# 4 20.07051 19.31315 20.11068 20.70136 19.27111 18.86186 20.87813 19.69404 22.16896 20.77997
# 5 20.12929 19.55434 19.44416 19.52721 19.37496 21.25381 20.82158 19.61953 21.20796 19.91663
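
One caveat worth illustrating: comparing against max(x) keeps every column that ties for the maximum, whereas which.max() keeps only the first. A small sketch with a made-up data frame (the name tied and its values are hypothetical, chosen so that row 1 has a tie):

```r
# Hypothetical data: in row 1, columns a and b both hold the maximum (3)
tied <- data.frame(a = c(3, 1), b = c(3, 5), c = c(2, 4))

# x == max(x) keeps all tied columns, so the per-row results have
# different lengths and apply() returns a list instead of a vector
all_max <- apply(tied, 1, function(x) names(x)[x == max(x)])

# which.max() keeps only the first maximum, so the result stays
# a character vector with one name per row
first_max <- names(tied)[apply(tied, 1, which.max)]
```

With continuous random data such as rnorm() exact ties are unlikely, but with integer or rounded data the difference in return type can matter.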

And just for kicks, an over-the-top dplyr & reshape2 variation:

library(dplyr)
library(reshape2)
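df$row <- row.names(df)   # keep the row id as a regular column

melt(df, id.vars = "row") %>%   # long format: one line per (row, variable) pair
    group_by(row) %>%
    arrange(desc(value)) %>%    # largest values first
    slice(1)                    # first (largest) entry within each row group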

# Source: local data frame [5 x 3]
# Groups: row [5]
# 
# row variable    value
# (chr)   (fctr)    (dbl)
# 1     1       X4 21.78691
# 2     2       X6 20.83779
# 3     3       X1 21.55871
# 4     4       X9 22.16896
# 5     5       X6 21.25381
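
The same reshape-then-pick-the-top idea can probably be written with tidyr::pivot_longer() and dplyr::slice_max(). This is only a sketch, not the original answer's code: it assumes tidyr and dplyr (version 1.0.0 or later) are installed, and the column names row, variable and value are chosen here to mirror the output above:

```r
library(dplyr)
library(tidyr)

set.seed(123)
df <- data.frame(matrix(rnorm(50, 20), 5))

winners <- df %>%
  mutate(row = row_number()) %>%                    # remember the original row
  pivot_longer(-row,
               names_to = "variable",
               values_to = "value") %>%             # long format
  group_by(row) %>%
  slice_max(value, n = 1, with_ties = FALSE) %>%    # one winner per row
  ungroup()

winners$variable
```

Setting with_ties = FALSE guarantees exactly one result per row; leaving it at the default would return every tied column.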
SymbolixAU

I did it this way; let me know if it works for you.

df <- data.frame(matrix(rnorm(50, 20), 5))

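my_list <- character(nrow(df))       # pre-allocate the result vector

for (i in seq_len(nrow(df))) {
  x <- unlist(df[i, ])               # row i as a named numeric vector
  y <- sort(x, decreasing = TRUE)    # largest value first
  my_list[i] <- names(y)[1]          # column name of the row maximum
}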

> my_list
[1] "X5" "X7" "X7" "X6" "X8"
> df
        X1       X2       X3       X4       X5       X6       X7       X8       X9      X10
1 19.22859 19.78252 20.08969 19.60546 21.09189 18.27778 18.53504 19.38758 18.14770 20.64938
2 20.23044 21.90423 19.91845 21.06613 21.82551 21.08873 22.05754 19.81582 20.74686 19.38851
3 19.83008 19.58174 21.42340 19.66734 20.64790 19.72775 22.35714 19.23881 21.81957 19.44846
4 20.96194 20.17202 20.82502 19.11394 20.18380 21.64440 19.46687 19.73009 18.89267 20.89549
5 19.83232 20.40958 19.94605 19.49419 19.80325 20.39628 19.59710 21.84272 20.02212 21.22459
Dinesh.hmn