
I have a data frame of 5 columns with probability values.

The 5 columns correspond to the values of the target variable. I want one additional column that tags each row with the target value that has the maximum probability. Can anyone help?

For example:

  id    columnA columnB columnC columnD FinalTag
  1112  0.653   0.33    0.01    0.006   "A"
Pierre L

1 Answer


If I understood your question correctly, this would be one possibility. There might be simpler ways to obtain this result, though.

df1$FinalTag <- gsub("column", "", names(sapply(1:nrow(df1),
                                        function(x) which.max(df1[x,-1]))))

#     id columnA columnB columnC columnD FinalTag
# 1 1112   0.653   0.330    0.01   0.006        A
# 2 1114   0.234   0.581    0.10   0.085        B

Edit

As suggested by @DavidArenburg the same result can in fact be obtained in a more compact form:

df1$FinalTag <- sub("column", "", names(df1)[-1][max.col(df1[-1])])
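One caveat with the compact form: by default max.col() breaks ties at random (ties.method = "random"), and it treats entries within a small relative tolerance of the row maximum as tied. If reproducible tags matter, a minimal sketch along these lines might help. The data frame df2, the helper variable prob_cols, and the tied row with id 1115 are made up for illustration and are not part of the original question.

```r
# Example data from the answer plus one hypothetical row (id 1115)
# in which columnA and columnB are tied for the maximum.
df2 <- data.frame(
  id      = c(1112L, 1114L, 1115L),
  columnA = c(0.653, 0.234, 0.4),
  columnB = c(0.33, 0.581, 0.4),
  columnC = c(0.01, 0.1, 0.1),
  columnD = c(0.006, 0.085, 0.1)
)

# Names of the probability columns (everything except id)
prob_cols <- names(df2)[-1]

# ties.method = "first" always picks the leftmost maximum,
# so the result does not change between runs
idx <- max.col(df2[prob_cols], ties.method = "first")

# Strip the "column" prefix to get the tag
df2$FinalTag <- sub("column", "", prob_cols[idx])

print(df2)
```

With ties.method = "first" the tied row is tagged with the leftmost of the tied columns; "last" would pick the rightmost instead.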

data

df1 <- structure(list(id = c(1112L, 1114L), columnA = c(0.653, 0.234), 
        columnB = c(0.33, 0.581), columnC = c(0.01, 0.1), 
        columnD = c(0.006, 0.085)), 
        .Names = c("id", "columnA", "columnB", "columnC", "columnD"),
        class = "data.frame", row.names = c(NA, -2L))
RHertel
  • @DavidArenburg Yes, it seems quite similar, but I'm not sure it's really a duplicate... Although the programming methods used in the answer are very similar, the question of the OP appears to be different. – RHertel Mar 24 '16 at 19:42