I have a dataset that looks like this:

cluster1      cluster2      cluster3
0.0795604798  0.0934697636 -0.396044650
0.0086171605 -0.1467907623 -0.396044650
1.8838058726 -0.1507548515 -0.396044650

For each row, I want the maximum value and the name of the cluster (column) it comes from. The result should look something like this:

value           cluster
0.0934697636    cluster2
0.0086171605    cluster1
1.8838058726    cluster1

I'm doing this in R. With the following command, I get the maximum value from each row:

k <- apply(cscore, 1, max)

But I'm not sure how to get the cluster name as well.

Mridul Garg

1 Answer

Assuming the data is a data frame named df1, we can use max.col to get the column index of the maximum value in each row. Binding that index to the row index with cbind gives (row, column) pairs that extract each row's maximum ('value'), and the same column index looks up the corresponding column name ('cluster').

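# Column index of the maximum in each row; "first" picks the first column on ties
j1 <- max.col(df1, "first")
# Index with (row, column) pairs to pull out each row's maximum
value <- df1[cbind(seq_len(nrow(df1)), j1)]
# Map the column indices to column names
cluster <- names(df1)[j1]
res <- data.frame(value, cluster)
res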
#       value  cluster
#1 0.093469764 cluster2
#2 0.008617161 cluster1
#3 1.883805873 cluster1
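
For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds the three rows from the question under the assumed name df1 (the question calls the object cscore), repeats the max.col steps, and cross-checks the column indices with a row-wise apply() and which.max. The cross-check and the variable j2 are only illustrative additions, not part of the original approach.

```r
# Rebuild the three example rows from the question (df1 is an assumed name).
df1 <- data.frame(
  cluster1 = c(0.0795604798, 0.0086171605, 1.8838058726),
  cluster2 = c(0.0934697636, -0.1467907623, -0.1507548515),
  cluster3 = c(-0.396044650, -0.396044650, -0.396044650)
)

# Column index of each row's maximum; "first" picks the first column on ties.
j1 <- max.col(df1, ties.method = "first")

# Matrix indexing: each (row, column) pair selects one cell.
value <- df1[cbind(seq_len(nrow(df1)), j1)]
cluster <- names(df1)[j1]
res <- data.frame(value, cluster)

# Illustrative cross-check: which.max() also returns the first maximum per row.
j2 <- unname(apply(df1, 1, which.max))
stopifnot(all(j1 == j2))

print(res)
```

Note that max.col breaks ties at random by default, so passing "first" is what keeps the result reproducible when two clusters share the same maximum.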
akrun