Adding the value that appears most often in each row of a data.frame

Question

I have the following dataframe:

> head (PRED_BEST_SYSM_TEST2)
  V1 V2 V3 V4 V5
1  0  0  0  0  0
2  2  2  2  1  1
3  0  0  0  0  0
4  0  3  4  0  0
5  5  5  1  2  0
6  0  0  0  1  1

I would like to add a column to the dataframe that contains the number that appears most often in each row, as follows:

  V1 V2 V3 V4 V5 max_res
1  0  0  0  0  0    0
2  2  2  2  1  1    2
3  0  0  0  0  0    0
4  0  3  4  0  0    0
5  5  5  1  2  0    5
6  0  0  0  1  1    0

I use the following code:

g <- function(df)
{
  X <- as.data.frame(t(apply( df, 1,
                              function(row)
                              {
                                u <- unique(row)
                                n <- rowSums(outer(u,row,"=="))
                                if (length(u)==1 )
                                {
                                  c(row,u[which.max(n)],max(n),"",0)
                                }
                                else
                                {
                                  c(row,u[which.max(n)],max(n))
                                }
                              })))  

  colnames(X) <- c(colnames(df),"max_res")

  return(X)
}

g1<-g(PRED_BEST_SYSM_TEST2)

When I run head(g1), I get very strange results, such as:

  NA                  NA                  NA                  NA                  NA
                        NA                  NA                  NA                  NA                  NA
                        NA                  NA                  NA                  NA                  NA
                   NA                  NA                  NA                  NA                  NA                  NA
                   NA                  NA                  NA                  NA                       NA
                   NA                  NA                  NA                  NA                       NA
                   NA                  NA                  NA                  NA                       NA
                   NA                  NA                  NA                       NA                  NA

The PRED_BEST_SYSM_TEST2 dataframe details are:

 > str (PRED_BEST_SYSM_TEST2)
'data.frame':   100000 obs. of  5 variables:
 $ V1: Factor w/ 10 levels "0","1","2","3",..: 1 1 1 1 1 1 1 2 1 2 ...
 $ V2: Factor w/ 10 levels "0","1","2","3",..: 1 1 1 1 1 1 2 2 1 2 ...
 $ V3: Factor w/ 10 levels "0","1","2","3",..: 1 1 1 1 1 1 1 2 1 1 ...
 $ V4: Factor w/ 10 levels "0","1","2","3",..: 1 2 1 1 1 2 1 2 1 2 ...
 $ V5: Factor w/ 10 levels "0","1","2","3",..: 1 2 1 1 1 2 2 2 1 1 ...
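
The str() output shows that every column is a factor, and apply() turns a data frame of factors into a character matrix. In addition, the two branches inside g() return vectors of different lengths, so apply() cannot assemble a regular matrix. One possible approach, sketched below under assumptions, is to convert the factors to numbers first and then take the most frequent value per row. The small data frame `df` is a hypothetical stand-in built from the six rows shown by head(), and `row_mode` is a helper name made up for this illustration.

```r
# Hypothetical stand-in for head(PRED_BEST_SYSM_TEST2): factor columns, as in str()
vals <- list(V1 = c(0, 2, 0, 0, 5, 0),
             V2 = c(0, 2, 0, 3, 5, 0),
             V3 = c(0, 2, 0, 4, 1, 0),
             V4 = c(0, 1, 0, 0, 2, 1),
             V5 = c(0, 1, 0, 0, 0, 1))
df <- as.data.frame(lapply(vals, function(x) factor(x, levels = 0:9)))

# Convert each factor to the numbers its labels spell out.
# as.numeric(f) alone would return the internal level codes (1..10), not 0..9.
num <- as.data.frame(lapply(df, function(f) as.numeric(as.character(f))))

# Most frequent value in a vector. On ties the smallest value wins,
# because table() sorts its names and which.max() returns the first maximum.
row_mode <- function(x) {
  counts <- table(x)
  as.numeric(names(counts)[which.max(counts)])
}

# Apply the helper to every row and store the result as a new column
num$max_res <- apply(num, 1, row_mode)
num
```

For the six rows shown by head(), this gives max_res values of 0, 2, 0, 0, 5, 0. Note that the tie-breaking rule here (smallest value) differs from the unique()/which.max() logic in g(), which would pick the value that occurs first in the row.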

Thanks @Cath! Is there a way to convert the whole dataframe into a numeric dataframe? — Avi, Nov 10 '17 at 13:21
This is how it is created: PRED_BEST_SYSM_TEST2 = matrix(0, nrow(testing2), 5); PRED_BEST_SYSM_TEST2 <- as.data.frame(PRED_BEST_SYSM_TEST2); for (i in 1:5) { PRED_BEST_SYSM_TEST2[,i] <- predict(cart.models[BEST_SYSM_TREES_TRAIN[[i]]], testing2[,c(1:10)], type='class') } — Avi, Nov 10 '17 at 13:24
@Cath, this is not a duplicate! The linked Q&As are about maximal values; mine is about the maximal number of occurrences! — Avi, Nov 10 '17 at 13:49
