-3

I would like to convert this column into one binary column per breed (1 if the dog is that breed, 0 if it is not).

Sam P
  • Do not post your data as an image; please learn how to give a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/5963610) – Jaap Jun 25 '17 at 16:00

2 Answers

3

Use model.matrix() to convert your categorical variable into binary variables.

Breed = c(
  "Sheetland Sheepdog Mix",
  "Pit Bull Mix",
  "Lhasa Aposo/Miniature",
  "Cairn Terrier/Chihuahua Mix",
  "American Pitbull",
  "Cairn Terrier",
  "Pit Bull Mix"
)
df=data.frame(Breed)

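# "- 1" drops the intercept so every breed gets its own 0/1 column
dfcat = data.frame(model.matrix(~ Breed - 1, data = df))
# factor() so this also works when Breed is stored as character (the default since R 4.0)
names(dfcat) = levels(factor(df$Breed))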

So dfcat contains your binary variables:

dfcat
#American Pitbull Cairn Terrier Cairn Terrier/Chihuahua Mix Lhasa Aposo/Miniature Pit Bull Mix Sheetland Sheepdog Mix
#              0             0                           0                     0            0                      1
#              0             0                           0                     0            1                      0
#              0             0                           0                     1            0                      0
#              0             0                           1                     0            0                      0
#              1             0                           0                     0            0                      0
#              0             1                           0                     0            0                      0
#              0             0                           0                     0            1                      0
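
A minimal sketch of one way to avoid assigning the names by hand, assuming the column is called Breed: model.matrix() prefixes each level with the variable name, so the prefix can be stripped with sub() before converting to a data frame. The intermediate name mm is only illustrative.

```r
Breed <- c("Sheetland Sheepdog Mix", "Pit Bull Mix", "Lhasa Aposo/Miniature",
           "Cairn Terrier/Chihuahua Mix", "American Pitbull", "Cairn Terrier",
           "Pit Bull Mix")
df <- data.frame(Breed = factor(Breed))

# Build the indicator matrix; columns come out named "Breed<level>"
mm <- model.matrix(~ Breed - 1, data = df)

# Strip the leading variable name so only the level remains
colnames(mm) <- sub("^Breed", "", colnames(mm))

# check.names = FALSE keeps the spaces and slashes in the column names
dfcat <- data.frame(mm, check.names = FALSE)
```

Because the prefix is removed by pattern rather than by position, this should keep working if the set of breeds changes, as long as the variable name in the formula stays Breed.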
FAMG
  • I just tried your recommendation. But it always gives me the name of the category with Breed in front (e.g., BreedAmericanPitbull). Would be nice to have a way without renaming. – FAMG Jul 06 '17 at 11:29
  • yup, you're right - I was talking mince – user20650 Jul 06 '17 at 11:37
1

One way is to loop over the unique() values of the column with a for loop:

Breed = c(
  "Sheetland Sheepdog Mix",
  "Pit Bull Mix",
  "Lhasa Aposo/Miniature",
  "Cairn Terrier/Chihuahua Mix",
  "American Pitbull",
  "Cairn Terrier",
  "Pit Bull Mix"
)
df=data.frame(Breed)

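# R is case-sensitive: the column is Breed, not breed
for (i in unique(as.character(df$Breed))) {
  # add one column per breed: 1 where the row matches, 0 otherwise
  df[, i] = ifelse(df$Breed == i, 1, 0)
}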
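
The same indicator columns can also be built without an explicit loop. This is a minimal sketch using sapply(); the names breeds and dummies are illustrative and not part of the original answer.

```r
Breed <- c("Sheetland Sheepdog Mix", "Pit Bull Mix", "Lhasa Aposo/Miniature",
           "Cairn Terrier/Chihuahua Mix", "American Pitbull", "Cairn Terrier",
           "Pit Bull Mix")
df <- data.frame(Breed)

# as.character() makes this work whether Breed is a factor or a character vector
breeds <- unique(as.character(df$Breed))

# One column per breed: 1L where the row matches, 0L otherwise
dummies <- sapply(breeds, function(b) as.integer(df$Breed == b))
colnames(dummies) <- breeds

# cbind() on a data frame keeps the breed names as column names unchanged
df <- cbind(df, dummies)
```

Each row should then contain exactly one 1 across the new columns, which can be checked with rowSums(df[, -1]).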
Bea