Transform multiple rows of a data frame into one row with multiple columns with R

Question

I have a data frame with three columns:

df <- data.frame(UserId = c(1, 2, 2, 2, 3, 3), CatoId = c('C', 'A', 'B', 'C', 'D', 'E'), No = c(1, 9, 2, 2, 5, 3))

UserId CatoId No  
1       C     1  
2       A     9  
2       B     2  
2       C     2  
3       D     5  
3       E     3

I would like to transform it into the following structure:

UserId A B C D E  
  1    0 0 1 0 0  
  2    9 2 2 0 0 
  3    0 0 0 5 3

Here the columns represent all possible values of CatoId, and missing combinations are filled with 0.
The original data frame has 2 million rows and CatoId has 21 distinct values, so I don't want to use any loops. Is there a way to do this in R? If not, what is the best way to proceed?
My goal is to apply a clustering algorithm to the resulting data frame.

http://stackoverflow.com/questions/5890584/how-to-reshape-data-from-long-to-wide-format or http://stackoverflow.com/questions/9617348/reshape-three-column-data-frame-to-matrix-long-to-wide-format — jogo, Mar 21 '17 at 19:15

score 0 · Answer 1 · answered Mar 21 '17 at 19:25

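You can do this using dcast from the reshape2 package. The formula puts UserId in the rows and spreads CatoId across the columns, value.var names the column that supplies the cell values, and fill = 0 replaces missing combinations with 0:

library(reshape2)
df1 <- dcast(df, UserId ~ CatoId, value.var = "No", fill = 0)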
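
If you would rather avoid an extra package, the same reshape can be sketched in base R with xtabs. This is only an illustration on the sample data from the question, not something benchmarked on 2 million rows; the names tab and wide are placeholders I made up. Note that xtabs sums No when a (UserId, CatoId) pair appears more than once.

```r
# Sample data from the question
df <- data.frame(
  UserId = c(1, 2, 2, 2, 3, 3),
  CatoId = c("C", "A", "B", "C", "D", "E"),
  No     = c(1, 9, 2, 2, 5, 3)
)

# Cross-tabulate: sum No for each (UserId, CatoId) pair;
# pairs that never occur get 0
tab <- xtabs(No ~ UserId + CatoId, data = df)

# Turn the contingency table into a plain data frame, one row per user
wide <- as.data.frame.matrix(tab)

# Move the user ids from the row names into a regular column
wide <- cbind(UserId = as.numeric(rownames(wide)), wide)
rownames(wide) <- NULL

print(wide)
```

On the sample data this should give one row per UserId with columns A to E, matching the layout asked for in the question. The numeric columns (everything except UserId) can then be passed to a clustering function.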

tbradley