0

Given a data frame in R, how can you find, for each row, the index of the maximum value across multiple columns (excluding NAs), as well as the name of the column that holds that maximum?

Matt
omirz

1 Answer

3

You can use which.max with apply to get, for each row, the column number of the first maximum; which.max skips NA values. Indexing colnames with that number then gives the name of the column.

x$indexOfMax <- apply(x, 1, which.max)
x$colName <- colnames(x)[x$indexOfMax]
x
#  Red Blue Yellow Green Purple indexOfMax colName
#1   5    8     10     3     NA          3  Yellow
#2   3    7      2    NA      1          2    Blue
#3   3   NA     NA     2      8          5  Purple

Data:

x <- data.frame(Red=c(5,3,3), Blue=c(8,7,NA), Yellow=c(10,2,NA),
                Green=c(3,NA,2), Purple=c(NA,1,8))
GKi
  • Thanks for this, it works well. However, when trying it on a larger dataset there are some instances where all columns contain NA values. Then the indexOfMax column changes to a different format e.g. integer(0) or c(Purple = 5) rather than just 5. Is there a workaround for when all are NAs? – omirz Jul 01 '20 at 13:58
  • When there are only `NA` values you usually cannot say where the largest number is. Can you update your question to show the problem? – GKi Jul 02 '20 at 05:44
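
One possible workaround for the all-NA case raised in the comments, as a minimal sketch rather than a definitive fix: which.max returns integer(0) for a row that contains only NA, so apply can no longer simplify its result to a plain vector and returns a list instead. Mapping the empty result to NA keeps the column an ordinary integer vector. The helper name first_max_index, the vector value_cols, and the extra all-NA fourth row are assumptions added for illustration; they are not part of the original question or answer.

```r
# Example data from the answer, plus a hypothetical fourth row that is all NA.
x <- data.frame(Red = c(5, 3, 3, NA), Blue = c(8, 7, NA, NA),
                Yellow = c(10, 2, NA, NA), Green = c(3, NA, 2, NA),
                Purple = c(NA, 1, 8, NA))

# Hypothetical helper: which.max() gives integer(0) when every value is NA,
# so return NA_integer_ in that case to keep one integer per row.
first_max_index <- function(v) {
  i <- which.max(v)
  if (length(i) == 0L) NA_integer_ else unname(i)
}

# Restrict the search to the value columns so that helper columns added
# below are never included if the code is run again.
value_cols <- c("Red", "Blue", "Yellow", "Green", "Purple")

x$indexOfMax <- unname(apply(as.matrix(x[value_cols]), 1, first_max_index))
x$colName <- value_cols[x$indexOfMax]  # indexing with NA yields NA
x
```

With this sketch, rows that contain only NA get NA in both indexOfMax and colName, which reflects that no maximum can be identified for them.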