I have a data frame with three columns:

df <- data.frame(UserId = c(1,2,2,2,3,3), CatoId = c('C','A','B','C','D','E'), No = c(1,9,2,2,5,3))

UserId CatoId No  
1       C     1  
2       A     9  
2       B     2  
2       C     2  
3       D     5  
3       E     3

I would like to transform it into the following structure:

UserId A B C D E  
  1    0 0 1 0 0  
  2    9 2 2 0 0 
  3    0 0 0 5 3

Here the columns represent all possible values of CatoId.
The original data frame has 2 million rows and CatoId has 21 distinct values, so I would like to avoid loops. Is there a way to do this in R? If not, what is the best way to proceed?
My goal is to apply a clustering algorithm to the resulting data frame.

Has QUIT--Anony-Mousse
P_Sta
    http://stackoverflow.com/questions/5890584/how-to-reshape-data-from-long-to-wide-format or http://stackoverflow.com/questions/9617348/reshape-three-column-data-frame-to-matrix-long-to-wide-format – jogo Mar 21 '17 at 19:15

1 Answer

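You can do this using dcast from the reshape2 package (data.table also provides its own dcast for data.tables):

library(reshape2)
df1 <- dcast(df, UserId ~ CatoId, value.var = "No", fill = 0)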
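
If installing a package is not an option, the same long-to-wide reshape can be sketched in base R. This is not the dcast approach above but a swapped-in alternative using xtabs, which sums No for each UserId/CatoId pair and leaves 0 where a pair is absent. The names tab and wide are only illustrative, and how well this scales to 2 million rows has not been measured here.

```r
# Sample data from the question
df <- data.frame(
  UserId = c(1, 2, 2, 2, 3, 3),
  CatoId = c("C", "A", "B", "C", "D", "E"),
  No     = c(1, 9, 2, 2, 5, 3)
)

# xtabs() sums No for every UserId/CatoId combination;
# combinations that never occur get a 0
tab <- xtabs(No ~ UserId + CatoId, data = df)

# Turn the contingency table into a plain data frame, one row per user
wide <- as.data.frame.matrix(tab)

# Move the user ids from the row names into a regular column
wide <- cbind(UserId = as.numeric(rownames(wide)), wide)
rownames(wide) <- NULL

print(wide)
```

Note that xtabs adds up No when a UserId/CatoId pair appears more than once, so duplicates are aggregated rather than rejected.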
tbradley