R: Get column name based on row values in R

Question

I have a table like the one below and would like to create suggestions based on the row values in R.

This is what I have -

id  class1  class2  class3  class4
A   0.98    0.48    0.21    0.99
B   0.22    0.31    0.41    0.11
C   0.70    0.81    0.61    0.21

I would like to have two new columns ('sugg1', 'sugg2') that will give the column names of the two top maximum values for each row i.e. for the first row, 0.99 is the maximum value, so its corresponding column name is class4, and the next max value is 0.98 for which the column name is class1.

id  class1  class2  class3  class4  sugg1   sugg2
A   0.98    0.48    0.21    0.99    class4  class1
B   0.22    0.31    0.41    0.11    class3  class2
C   0.70    0.81    0.61    0.21    class2  class1

The max of the class columns goes to sugg1 and the second best to sugg2 — Sambit Nandi, Aug 20 '16 at 18:17

akrun · Accepted Answer · 2016-08-20T18:37:52.997

We can use apply with MARGIN = 1 to loop over the rows, sort each row's values in decreasing order (by sorting the negated values), take the names of the first two (head(..., 2)), transpose the output, and assign it to two new columns in the original dataset.

df1[paste0("sugg", 1:2)] <- t(apply(df1[-1], 1, FUN = function(x) names(head(sort(-x),2))))

df1
#  id class1 class2 class3 class4  sugg1  sugg2
#1  A   0.98   0.48   0.21   0.99 class4 class1
#2  B   0.22   0.31   0.41   0.11 class3 class2
#3  C   0.70   0.81   0.61   0.21 class2 class1
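
To make the steps inside the anonymous function easier to follow, here is a small sketch that applies them to a single row. The vector `x` is just row A from the question typed in by hand, and the intermediate names `sorted` and `top2` are introduced here for illustration only.

```r
# Row A from the question, written as a named numeric vector
x <- c(class1 = 0.98, class2 = 0.48, class3 = 0.21, class4 = 0.99)

# sort() is ascending by default, so negating the values puts the largest first
sorted <- sort(-x)

# keep the names of the first two elements, i.e. the two largest original values
top2 <- names(head(sorted, 2))
top2
# [1] "class4" "class1"
```

`sort(x, decreasing = TRUE)` should give the same names here; the negation is simply the shorter spelling used in the answer.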

This can also be done by melting into 'long' format, ordering by 'value' in decreasing order, taking the first two 'variable' entries for each 'id', casting back to 'wide' format, and then joining on the original dataset.

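library(data.table) # v1.9.7+ (needed for rowid())
# df1[..., on = "id"] needs a data.table; run this on the original df1 (without the sugg columns)
setDT(df1)
df1[dcast(melt(df1, id.var = "id")[order(-value), head(variable, 2), 
       id], id ~ paste0("sugg", rowid(id)), value.var = "V1"), on = "id"]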
#   id class1 class2 class3 class4  sugg1  sugg2
#1:  A   0.98   0.48   0.21   0.99 class4 class1
#2:  B   0.22   0.31   0.41   0.11 class3 class2
#3:  C   0.70   0.81   0.61   0.21 class2 class1

data

df1 <- structure(list(id = c("A", "B", "C"), class1 = c(0.98, 0.22, 
0.7), class2 = c(0.48, 0.31, 0.81), class3 = c(0.21, 0.41, 0.61
), class4 = c(0.99, 0.11, 0.21)), .Names = c("id", "class1", 
"class2", "class3", "class4"), class = "data.frame",
row.names = c(NA, -3L))

Awesome! Because of the transposing detail :) One thinking, why `apply()` produces column wise. Is it default for it? or is it this time you wanted two values for each row? — Sowmya S. Manian, Aug 20 '16 at 18:39
@SowmyaS.Manian Thanks for the kind words. Yes, with `MARGIN = 1`, each row's result is returned as a column, so the output comes back transposed; we transpose it back with `t()` before assigning it to the new columns. — akrun, Aug 20 '16 at 18:40
@SowmyaS.Manian Also, you can check [here](http://stackoverflow.com/questions/9521260/why-apply-returns-a-transposed-xts-matrix) for the transposing problem. — akrun, Aug 20 '16 at 18:41
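
The transposing behaviour discussed in these comments can be checked directly. The sketch below rebuilds the question's scores as a matrix (the name `scores` and the use of a matrix instead of the data frame are assumptions made to keep the example self-contained) and inspects the shape of what `apply()` returns.

```r
# The class columns from the question as a 3 x 4 matrix, filled column by column
scores <- matrix(c(0.98, 0.22, 0.70,
                   0.48, 0.31, 0.81,
                   0.21, 0.41, 0.61,
                   0.99, 0.11, 0.21),
                 nrow = 3,
                 dimnames = list(c("A", "B", "C"), paste0("class", 1:4)))

# The function returns two names per row, so apply() stores each row's result as a column
res <- apply(scores, 1, function(x) names(head(sort(-x), 2)))
dim(res)
# [1] 2 3

# t() restores one row per id, which is the shape needed for the two new columns
dim(t(res))
# [1] 3 2
```

When the function returns a single value per row, `apply()` gives back a plain vector and no transpose is needed.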
