I have a table like the one below and would like to create suggestions based on the row values in R.

This is what I have -

id  class1  class2  class3  class4
A   0.98    0.48    0.21    0.99
B   0.22    0.31    0.41    0.11
C   0.70    0.81    0.61    0.21

I would like to add two new columns ('sugg1', 'sugg2') holding the column names of the two largest values in each row. For example, in the first row the maximum is 0.99, so sugg1 is class4; the next largest value is 0.98, so sugg2 is class1.

id  class1  class2  class3  class4  sugg1   sugg2
A   0.98    0.48    0.21    0.99    class4  class1
B   0.22    0.31    0.41    0.11    class3  class2
C   0.70    0.81    0.61    0.21    class2  class1

1 Answer

We can use apply with MARGIN = 1 to loop over the rows. For each row, sort the negated values (which puts the largest first), take the names of the first two with head(..., 2), then transpose the result and assign it to two new columns in the original dataset.

df1[paste0("sugg", 1:2)] <- t(apply(df1[-1], 1, FUN = function(x) names(head(sort(-x),2))))

df1
#  id class1 class2 class3 class4  sugg1  sugg2
#1  A   0.98   0.48   0.21   0.99 class4 class1
#2  B   0.22   0.31   0.41   0.11 class3 class2
#3  C   0.70   0.81   0.61   0.21 class2 class1
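
To see what the anonymous function does for one row, the same steps can be run by hand on row A. This is only a minimal sketch: it rebuilds the example data inline so it runs on its own, and the names `row_a`, `sorted` and `top2` are illustrative, not part of the answer.

```r
# Rebuild the example data so the snippet is self-contained
df1 <- data.frame(id = c("A", "B", "C"),
                  class1 = c(0.98, 0.22, 0.70),
                  class2 = c(0.48, 0.31, 0.81),
                  class3 = c(0.21, 0.41, 0.61),
                  class4 = c(0.99, 0.11, 0.21),
                  stringsAsFactors = FALSE)

# First row as a named numeric vector, with the id column dropped
row_a <- unlist(df1[1, -1])

# Negating the values makes the default ascending sort put the largest first
sorted <- sort(-row_a)

# Names of the two largest values in the row
top2 <- names(head(sorted, 2))
top2
# [1] "class4" "class1"
```

Sorting the negated values is equivalent to `sort(x, decreasing = TRUE)` here; either form keeps the names attached to the values, which is what makes `names()` return the column names.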

This can also be done by melting into 'long' format, ordering by 'value' in decreasing order, taking the first two 'variable' entries for each 'id', casting back to 'wide' format, and then joining with the original dataset on 'id'.

library(data.table) # v1.9.7+
# start from the original df1 (see 'data'); convert it to a data.table by reference
setDT(df1)
df1[dcast(melt(df1, id.vars = "id")[order(-value), head(variable, 2),
       id], id ~ paste0("sugg", rowid(id)), value.var = "V1"), on = "id"]
#   id class1 class2 class3 class4  sugg1  sugg2
#1:  A   0.98   0.48   0.21   0.99 class4 class1
#2:  B   0.22   0.31   0.41   0.11 class3 class2
#3:  C   0.70   0.81   0.61   0.21 class2 class1
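
The chained data.table call is dense, so it may help to see it split into named steps. The following is a sketch under the assumption that a recent data.table is installed; the intermediate names `dt`, `long`, `top2`, `wide` and `res` are hypothetical and only used for illustration.

```r
library(data.table)

# Rebuild the example data as a data.table so the snippet is self-contained
dt <- data.table(id = c("A", "B", "C"),
                 class1 = c(0.98, 0.22, 0.70),
                 class2 = c(0.48, 0.31, 0.81),
                 class3 = c(0.21, 0.41, 0.61),
                 class4 = c(0.99, 0.11, 0.21))

# 1. Long format: one row per (id, class) pair, columns id/variable/value
long <- melt(dt, id.vars = "id")

# 2. Order by value (largest first) and keep the first two class names per id;
#    the unnamed result column is called V1
top2 <- long[order(-value), head(variable, 2), by = id]

# 3. Back to wide format: rowid(id) numbers the rows within each id,
#    which yields the column names sugg1 and sugg2
wide <- dcast(top2, id ~ paste0("sugg", rowid(id)), value.var = "V1")

# 4. Join the suggestions onto the original table by id
res <- dt[wide, on = "id"]
res
```

Splitting the chain this way makes each intermediate table easy to print and inspect, at the cost of a few extra variable names.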

data

df1 <- structure(list(id = c("A", "B", "C"), class1 = c(0.98, 0.22, 
0.7), class2 = c(0.48, 0.31, 0.81), class3 = c(0.21, 0.41, 0.61
), class4 = c(0.99, 0.11, 0.21)), .Names = c("id", "class1", 
"class2", "class3", "class4"), class = "data.frame",
row.names = c(NA, -3L))
akrun
  • short and simple.. let me try this out :) – Sambit Nandi Aug 20 '16 at 18:19
  • Awesome! Because of the transposing detail :) One thinking, why `apply()` produces column wise. Is it default for it? or is it this time you wanted two values for each row? – Sowmya S. Manian Aug 20 '16 at 18:39
  • @SowmyaS.Manian Thanks for the kind words. Yes, with `MARGIN = 1`, the row/columns gets transposed, which we transpose it back before assigning to the new columns. – akrun Aug 20 '16 at 18:40
  • @SowmyaS.Manian Also, you can check [here](http://stackoverflow.com/questions/9521260/why-apply-returns-a-transposed-xts-matrix) for the transposing problem. – akrun Aug 20 '16 at 18:41