I would like to add a new column to a data.frame that holds, for each row, the name of the column containing the maximum value, considering only a subset of the columns.
Here is an example data.frame:
# creating the vectors then the data frame ------
id = c("a", "b", "c", "d")
ignore = c(1000,1000, 1000, 1000)
s1 = c(0,0,0,100)
s2 = c(100,0,0,0)
s3 = c(0,0,50,0)
s4 = c(50,0,50,0)
df1 <- data.frame(id,ignore,s1,s2,s3,s4)
(1) For each row, I want the name of the column holding the maximum value among s1-s4 only (i.e. the column called "ignore" should be left out).
(2) If there is a tie for the maximum, I would like the name of the last tied column returned (e.g. s4).
(3) As an extra favour: if all of s1-s4 are 0, I would ideally like NA returned.
Here is my best attempt so far:
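To make the three requirements concrete, this is the result I am hoping for on the example data. The values are written out by hand from the rules above, not computed, and the name `expected` is just a label for this illustration.

```r
# Desired column name per row id, derived by hand from rules (1)-(3):
#   a: s2 holds the maximum (100)
#   b: s1-s4 are all 0, so NA
#   c: s3 and s4 tie at 50, so the last one (s4)
#   d: s1 holds the maximum (100)
expected <- c(a = "s2", b = NA, c = "s4", d = "s1")
```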
df2 <- cbind(df1,do.call(rbind,apply(df1,1,function(x) {data.frame(max.col.name=names(df1)[which.max(x)],stringsAsFactors=FALSE)})))
This returns "ignore" for every row. If I remove that column and reorder the s1-s4 columns as s4-s1, it gives what I want for every row except row b.
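One direction that might work, offered only as a minimal sketch: base R's `max.col()` accepts `ties.method = "last"`, which seems to match rule (2), and the all-zero rows can be overwritten with NA afterwards. The helper names `cols`, `m`, `idx` and `res` are placeholders of my own, and I am assuming the s-columns are numeric with no missing values.

```r
# Rebuild the example data so the sketch runs on its own
id     <- c("a", "b", "c", "d")
ignore <- c(1000, 1000, 1000, 1000)
s1     <- c(0, 0, 0, 100)
s2     <- c(100, 0, 0, 0)
s3     <- c(0, 0, 50, 0)
s4     <- c(50, 0, 50, 0)
df1 <- data.frame(id, ignore, s1, s2, s3, s4)

# Restrict the search to the columns of interest (this skips "id" and "ignore")
cols <- c("s1", "s2", "s3", "s4")
m    <- as.matrix(df1[, cols])

# Column index of the row-wise maximum; ties go to the last tied column
idx <- max.col(m, ties.method = "last")
res <- cols[idx]

# Rows where every selected column is 0 get NA instead of a column name
res[rowSums(m != 0) == 0] <- NA

df2 <- df1
df2$max.col.name <- res
df2$max.col.name
# [1] "s2" NA   "s4" "s1"
```

Selecting the columns by name before searching avoids the problem in my attempt, where `apply()` also sees the "id" and "ignore" columns.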
How would you approach this?
Many thanks indeed.