This is similar to the question "R Convert row data to binary columns", but I want to preserve the number of rows.

How can I convert the row data to binary columns while preserving the number of rows?

Example

Input

myData<-data.frame(gender=c("man","women","child","women","women","women","man"),
                   age=c(22, 22, 0.33,22,22,22,111))


 myData
   gender    age
 1    man  22.00
 2  women  22.00
 3  child   0.33
 4  women  22.00
 5  women  22.00
 6  women  22.00
 7    man 111.00

How can I get this intended output?

   gender    age    man   women  child  
 1    man  22.00    1     0      0
 2  women  22.00    0     1      0
 3  child   0.33    0     0      1
 4  women  22.00    0     1      0
 5  women  22.00    0     1      0
 6  women  22.00    0     1      0
 7    man 111.00    1     0      0
zx8754
hhh

2 Answers

Perhaps a slightly easier solution without reliance on another package:

data.frame(myData, model.matrix(~gender+0, myData))

jkt
  • Why does this require the `+0`? Without it, I get an `X.Intercept.` column. – hhh Jun 22 '17 at 12:39
  • @hhh `+0` removes the intercept and thus allows the reference category to be represented as a dummy variable. The reference category is by default the first value, i.e. `child` in your case as it is the first in the alphabetical order of `gender` categories. – jkt Jun 22 '17 at 13:10
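
To make the effect of `+0` concrete, here is a minimal sketch that compares the two formulas on the question's data. The variable names `with_intercept`, `no_intercept`, and `result` are illustrative only, and stripping the `gender` prefix with `sub()` is an optional extra step that is not part of the answer above.

```r
myData <- data.frame(
  gender = c("man", "women", "child", "women", "women", "women", "man"),
  age    = c(22, 22, 0.33, 22, 22, 22, 111)
)

# With the intercept, the first level (child) is the reference
# category and gets no column of its own.
with_intercept <- model.matrix(~ gender, myData)
colnames(with_intercept)
# "(Intercept)" "genderman" "genderwomen"

# Without the intercept, every level gets its own 0/1 column.
no_intercept <- model.matrix(~ gender + 0, myData)
colnames(no_intercept)
# "genderchild" "genderman" "genderwomen"

# Optional: drop the "gender" prefix so the column names match the categories.
colnames(no_intercept) <- sub("^gender", "", colnames(no_intercept))
result <- data.frame(myData, no_intercept)
```

The number of rows is unchanged because `model.matrix` returns one row per observation.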
We can use `dcast` from data.table to do this:

library(data.table)
dcast(setDT(myData), gender + age + seq_len(nrow(myData)) ~ 
                             gender, length)[, myData := NULL][]

Or use `table` from base R and `cbind` the result with the original dataset:

cbind(myData, as.data.frame.matrix(table(1:nrow(myData), myData$gender)))
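
The `table` approach orders the new columns alphabetically (child, man, women). If the column order from the question's intended output (man, women, child) matters, one possible variation is to fix the factor levels first. This is a sketch under that assumption; `g`, `dummies`, and `result` are illustrative names that do not appear in the answer.

```r
myData <- data.frame(
  gender = c("man", "women", "child", "women", "women", "women", "man"),
  age    = c(22, 22, 0.33, 22, 22, 22, 111)
)

# Order the levels by first appearance: man, women, child.
g <- factor(myData$gender, levels = unique(myData$gender))

# Cross-tabulate row index against level: one row per observation,
# one 0/1 column per level.
dummies <- as.data.frame.matrix(table(seq_len(nrow(myData)), g))

result <- cbind(myData, dummies)

# Sanity checks: row count is preserved and each row has exactly one 1.
stopifnot(nrow(result) == nrow(myData), all(rowSums(dummies) == 1))
```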
akrun