
I am trying to assign classifications, but I'm running into some problems. The normal classification method takes the majority of the votes, but I want to be a bit stricter. Let's say I've got the following matrix:

     c1    c2    c3
x1   0.09  0.7   0.21
x2   0.34  0.33  0.33

If I take the majority of the votes, the classification will be as follows:

     class
x1   c2
x2   c1

But I want to set the threshold to e.g. 0.40 votes, so that I would get these classifications:

     class
x1   c2
x2   unassigned

I know how to get the max in a row and how to get the column name that holds the max in that row (from this issue, but it doesn't solve mine), but for some reason I can't seem to require the max to be at least 0.40. Any help would be appreciated :)

kleurless

2 Answers


You can use max.col to get the column index of the maximum value in each row.

cols <- names(df)[max.col(df) * NA^!rowSums(df > 0.4) > 0]
cols[is.na(cols)] <- 'unassigned'
cols
#[1] "c2"         "unassigned"

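The NA^!rowSums(df > 0.4) > 0 part returns NA for rows that have no value greater than 0.4 and 1 for all other rows, so multiplying by it blanks out the column index of the rows that miss the threshold. Note that > is a strict comparison; use >= if a vote of exactly 0.4 should still count.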
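The one-liner is dense, so here is a step-by-step sketch of the same idea. The intermediate names (has_hit, mask, idx) are made up for illustration and are not part of the original answer; the data frame is the one from this answer.

```r
# Same data as in this answer
df <- data.frame(c1 = c(0.09, 0.34), c2 = c(0.7, 0.33), c3 = c(0.21, 0.33),
                 row.names = c("x1", "x2"))

# Step 1: does each row contain at least one value above 0.4?
has_hit <- rowSums(df > 0.4) > 0

# Step 2: NA^0 is 1 and NA^1 is NA, so negating the flag maps
# rows with a hit to 1 and rows without one to NA
mask <- NA^!has_hit

# Step 3: multiplying keeps the column index where mask is 1
# and turns it into NA everywhere else
idx <- max.col(df) * mask

# Step 4: indexing with NA yields NA, which is then relabelled
cols <- names(df)[idx]
cols[is.na(cols)] <- "unassigned"
cols
```

Because ! binds more loosely than the comparison operators in R, the original expression parses as NA^(!(rowSums(df > 0.4) > 0)), which is what steps 1 and 2 spell out.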

data

df <- structure(list(c1 = c(0.09, 0.34), c2 = c(0.7, 0.33), c3 = c(0.21, 
0.33)), class = "data.frame", row.names = c("x1", "x2"))
Ronak Shah

I would suggest this approach with apply():

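#Function: name of the largest value above the threshold,
#or 'not assigned' when no value in the row exceeds it
myfun <- function(x, threshold = 0.4)
{
  #No value above the threshold: return early instead of
  #calling max() on an empty vector, which raises a warning
  if(!any(x > threshold))
  {
    return('not assigned')
  }
  #which.max() returns the first maximum, so ties go to the leftmost column
  return(names(x)[which.max(x)])
}
#Apply row-wise (run this before adding any non-numeric columns to df)
df$Class <- apply(df,1,myfun)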

Output:

     c1   c2   c3        Class
x1 0.09 0.70 0.21           c2
x2 0.34 0.33 0.33 not assigned
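
Since the question asks for a maximum of "at least" 0.40, a vectorised variant with the threshold as a parameter may also be useful. This is only a sketch: classify_rows and its arguments are hypothetical names, not something from either answer, and it uses >= rather than > so that a vote of exactly the threshold still counts.

```r
# Hypothetical helper: the name and arguments are illustrative only
classify_rows <- function(votes, threshold = 0.4, label = "unassigned") {
  m <- as.matrix(votes)
  # ties.method = "first" makes ties deterministic (leftmost column wins)
  best <- max.col(m, ties.method = "first")
  # Pull out the winning value of each row via matrix indexing
  best_val <- m[cbind(seq_len(nrow(m)), best)]
  # ">=" implements "at least threshold"
  ifelse(best_val >= threshold, colnames(m)[best], label)
}

# Same votes as in the question
df <- data.frame(c1 = c(0.09, 0.34), c2 = c(0.7, 0.33), c3 = c(0.21, 0.33),
                 row.names = c("x1", "x2"))

res_default <- classify_rows(df)                # threshold of 0.4
res_loose <- classify_rows(df, threshold = 0.3) # looser threshold
res_default
```

With the default threshold x2 stays unassigned; lowering the threshold to 0.3 lets its largest vote (c1) through.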
Duck