data reshaping in R

Question

I have the following data:

mydata <- data.frame(id=c(1,1,1,2,2,3,3,3,3,4,5,5,5), age=c(20,20,20,25,25,19,19,19,19,30,22,22,22), category=c("a","b","c","a","d","a","b","c","d","a","d","b","c"))

I want to reshape it to

ID  Age a   b   c   d
1   20  1   1   1   0
2   25  1   0   0   1
3   19  1   1   1   1
4   30  1   0   0   0
5   22  0   1   1   1

Basically, I need one binary column for each distinct value (factor level) of the ‘category’ variable, set to 1 when that id has the category and 0 otherwise.

akrun · Accepted Answer · 2015-07-08T20:23:32.930

You can try `dcast` from reshape2, using `length` to count the rows in each id/age/category cell:

 library(reshape2)
 dcast(mydata, id+age~category, value.var='category', length)
 #  id age a b c d
 #1  1  20 1 1 1 0
 #2  2  25 1 0 0 1
 #3  3  19 1 1 1 1
 #4  4  30 1 0 0 0
 #5  5  22 0 1 1 1
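Note that `length` returns counts, which coincide with the requested 0/1 flags only while each id/category pair occurs at most once. A minimal base R sketch of capping the counts at 1, assuming `age` is constant within each `id`; the duplicated row and the names `dup`, `counts`, `flags`, `keys` and `result` are made up for illustration and are not part of the original answer:

```r
mydata <- data.frame(
  id = c(1,1,1,2,2,3,3,3,3,4,5,5,5),
  age = c(20,20,20,25,25,19,19,19,19,30,22,22,22),
  category = c("a","b","c","a","d","a","b","c","d","a","d","b","c")
)

# Hypothetical duplicate: repeat the first row so that (id 1, "a") occurs twice.
dup <- rbind(mydata, mydata[1, ])

# Count rows per (id, category) pair; unclass() leaves a plain integer matrix.
counts <- unclass(table(dup$id, dup$category))

# Cap the counts at 1 to get presence/absence flags.
flags <- as.data.frame((counts > 0) * 1L)

# Re-attach id and age (assumption: age is constant within each id).
keys <- unique(dup[c("id", "age")])
keys <- keys[order(keys$id), ]
result <- cbind(keys, flags[as.character(keys$id), ])
rownames(result) <- NULL
print(result)
```

If duplicates cannot occur in your data, the plain `dcast` call is enough.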

Or using dplyr/tidyr, by adding a constant column and spreading it with 0 as the fill value:

 library(dplyr)
 library(tidyr)
 mydata %>%
       mutate(val=1) %>% 
       spread(category, val, fill=0)
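In current tidyr releases `spread()` is marked as superseded in favour of `pivot_wider()`. A sketch of the equivalent call, assuming tidyr 1.1.0 or later (where `values_fill` accepts a single value); the name `wide` is only illustrative:

```r
library(dplyr)
library(tidyr)

mydata <- data.frame(
  id = c(1,1,1,2,2,3,3,3,3,4,5,5,5),
  age = c(20,20,20,25,25,19,19,19,19,30,22,22,22),
  category = c("a","b","c","a","d","a","b","c","d","a","d","b","c")
)

wide <- mydata %>%
  mutate(val = 1) %>%                      # constant marker: category is present
  pivot_wider(names_from = category,       # one new column per category value
              values_from = val,
              values_fill = 0)             # absent id/category pairs become 0
print(wide)
```

Like `spread()`, this assumes each id/category pair appears only once.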

Or an option suggested by @Pierre Lafortune

 do.call(data.frame,aggregate(category~id+age, mydata, table))
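The flattened columns come out as `category.a`, `category.b` and so on, and the rows follow the order of the grouping variables rather than `id`. A possible cleanup, sketched under the assumption that `category` is stored as a factor (forced here with `stringsAsFactors = TRUE`, so that `table()` reports every level for each group); the name `res` is illustrative:

```r
mydata <- data.frame(
  id = c(1,1,1,2,2,3,3,3,3,4,5,5,5),
  age = c(20,20,20,25,25,19,19,19,19,30,22,22,22),
  category = c("a","b","c","a","d","a","b","c","d","a","d","b","c"),
  stringsAsFactors = TRUE   # assumption: category as a factor
)

# aggregate() returns 'category' as a matrix column; data.frame() flattens it.
res <- do.call(data.frame, aggregate(category ~ id + age, mydata, table))

# Drop the "category." prefix so the columns are named a, b, c, d.
names(res) <- sub("^category\\.", "", names(res))

# Sort by id and reset the row names.
res <- res[order(res$id), ]
rownames(res) <- NULL
print(res)
```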

edited Jul 08 '15 at 20:23

answered Jul 08 '15 at 18:55

akrun


Could this work too? `aggregate(category~id+age, mydata, table)` The names are off, though. – Pierre L Jul 08 '15 at 20:18

@PierreLafortune It looks like that works too, but the `category` column is a matrix. So maybe wrapping it as `do.call(data.frame,aggregate(category~id+age, mydata, table))` makes a proper data.frame. – akrun Jul 08 '15 at 20:21

@PierreLafortune I added your version in the post as this is already marked as duplicate – akrun Jul 08 '15 at 20:24

@PierreLafortune You may need to create a sequence column by group X, i.e. `dta %>% group_by(X) %>% mutate(val=1, indx=row_number()) %>% spread(X, val, fill=0) %>% arrange(id_2, id_1) %>% select(-indx)` – akrun Jul 08 '15 at 20:30
