3

I have a medium-sized dataframe in which I want to convert one categorical column into binary columns, one for each category.

At the same time, I want to keep the rest of the columns in the dataframe.

What would be the easiest way to achieve this?

Here is an example of what I want to do:

d<-data.frame(ID=c("a","b","c","d"), Gender=c("male", "male", "female","female"), Age =c(23,45,18,11))

  ID Gender Age
1  a   male  23
2  b   male  45
3  c female  18
4  d female  11

The result should look like d2, with the ID and Age columns still present and unchanged:

d2<-data.frame(ID=c("a","b","c","d"), Gender.male=c(1, 1, 0, 0), Gender.female=c(0,0,1,1), Age =c(23,45,18,11))

  ID Gender.male Gender.female Age
1  a           1             0  23
2  b           1             0  45
3  c           0             1  18
4  d           0             1  11
aldorado

3 Answers

7

We can use spread

library(tidyverse)
d %>% 
  mutate(n = 1) %>% 
  spread(Gender, n, fill = 0)
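
In current tidyr, spread() is superseded by pivot_wider(). Here is a minimal sketch of the same idea, assuming tidyr 1.1.0 or later and the d from the question. The "Gender." prefix and the name res are my own choices, picked to match the column names in d2. Note that the new columns end up after Age rather than in Gender's original position.

```r
library(dplyr)
library(tidyr)

d <- data.frame(ID = c("a", "b", "c", "d"),
                Gender = c("male", "male", "female", "female"),
                Age = c(23, 45, 18, 11))

res <- d %>%
  mutate(n = 1) %>%                          # helper column holding the "1" flags
  pivot_wider(names_from = Gender,           # one new column per category
              values_from = n,
              names_prefix = "Gender.",      # assumption: prefix chosen to mimic d2
              values_fill = 0)               # categories a row does not have become 0
res
```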
akrun
5

Or use dcast from reshape2

library(reshape2)
dcast(d, ID + Age ~ Gender, length)
#  ID Age female male
#1  a  23      0    1
#2  b  45      0    1
#3  c  18      1    0
#4  d  11      1    0
markus
4

We can use the dummies package.

library(dummies)

d2 <- dummy("Gender", d)
d3 <- cbind(d, d2)
d3$Gender <- NULL
d3
#   ID Age Genderfemale Gendermale
# 1  a  23            0          1
# 2  b  45            0          1
# 3  c  18            1          0
# 4  d  11            1          0
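
If an extra package is not wanted, something similar should be possible in base R with model.matrix(). This is only a sketch, assuming the d from the question. The names ind and res2 are made up for illustration, and the renaming step just mimics the "Gender.male" style from d2. One caveat: by default model.matrix() drops rows where Gender is NA, so cbind() would fail on data with missing values.

```r
d <- data.frame(ID = c("a", "b", "c", "d"),
                Gender = c("male", "male", "female", "female"),
                Age = c(23, 45, 18, 11))

# "- 1" removes the intercept, so every level of Gender gets its own 0/1 column
ind <- model.matrix(~ Gender - 1, data = d)

# Turn "Genderfemale" / "Gendermale" into "Gender.female" / "Gender.male"
colnames(ind) <- sub("^Gender", "Gender.", colnames(ind))

# Keep every column except Gender and append the indicator columns
res2 <- cbind(d[setdiff(names(d), "Gender")], ind)
res2
```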
www