I would like to convert this column into one binary column per breed (1 if the dog is that breed, 0 if it is not).
2 Answers
3
Use model.matrix() to convert your categorical variable into binary variables.
Breed = c(
"Sheetland Sheepdog Mix",
"Pit Bull Mix",
"Lhasa Aposo/Miniature",
"Cairn Terrier/Chihuahua Mix",
"American Pitbull",
"Cairn Terrier",
"Pit Bull Mix"
)
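# levels() below needs a factor; since R 4.0 strings are no longer converted by default
df = data.frame(Breed, stringsAsFactors = TRUE)
# "- 1" drops the intercept so that every breed gets its own column
dfcat = data.frame(model.matrix(~ Breed - 1, data = df))
# restore the original breed names as column names
names(dfcat) = levels(df$Breed)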
So dfcat contains your binary variables:
dfcat
#American Pitbull Cairn Terrier Cairn Terrier/Chihuahua Mix Lhasa Aposo/Miniature Pit Bull Mix Sheetland Sheepdog Mix
# 0 0 0 0 0 1
# 0 0 0 0 1 0
# 0 0 0 1 0 0
# 0 0 1 0 0 0
# 1 0 0 0 0 0
# 0 1 0 0 0 0
# 0 0 0 0 1 0

FAMG
I just tried your recommendation. But it always gives me the name of the category with Breed in front (e.g., BreedAmericanPitbull). It would be nice to have a way without renaming. – FAMG Jul 06 '17 at 11:29
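Regarding the prefix mentioned in the comment: model.matrix() builds each column name by pasting the variable name in front of the level, so some renaming step seems hard to avoid. A minimal sketch, assuming the column is called Breed as in the answer above, strips the prefix with sub() instead of assigning the names by hand (the names mm and dfcat are only illustrative):

```r
# Example data taken from the answer above
Breed <- c("Sheetland Sheepdog Mix", "Pit Bull Mix", "Lhasa Aposo/Miniature",
           "Cairn Terrier/Chihuahua Mix", "American Pitbull", "Cairn Terrier",
           "Pit Bull Mix")
df <- data.frame(Breed = factor(Breed))

# One indicator column per level; "- 1" drops the intercept
mm <- model.matrix(~ Breed - 1, data = df)

# Remove the leading variable name that model.matrix() adds to each level
colnames(mm) <- sub("^Breed", "", colnames(mm))

# check.names = FALSE keeps spaces and slashes in the column names
dfcat <- data.frame(mm, check.names = FALSE)
```

This only works as written if no breed name itself starts with "Breed"; the pattern would need adjusting for a differently named column.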
1
One way could be using unique() with a for-loop:
Breed = c(
"Sheetland Sheepdog Mix",
"Pit Bull Mix",
"Lhasa Aposo/Miniature",
"Cairn Terrier/Chihuahua Mix",
"American Pitbull",
"Cairn Terrier",
"Pit Bull Mix"
)
df=data.frame(Breed)
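# R is case-sensitive: the column is Breed, so df$breed would be NULL and the loop would never run
for (i in unique(df$Breed)) {
  # add one 0/1 column named after the breed
  df[, i] = ifelse(df$Breed == i, 1, 0)
}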

Bea
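The same idea as the for-loop above can also be written without an explicit loop. The following is only a sketch of one possible variant, using sapply() over the unique breeds; the names breeds and dummies are illustrative and not part of the answers above:

```r
# Example data taken from the answers above
Breed <- c("Sheetland Sheepdog Mix", "Pit Bull Mix", "Lhasa Aposo/Miniature",
           "Cairn Terrier/Chihuahua Mix", "American Pitbull", "Cairn Terrier",
           "Pit Bull Mix")
df <- data.frame(Breed)

# Distinct breeds in order of first appearance (as.character() also covers factor columns)
breeds <- unique(as.character(df$Breed))

# Build a 0/1 matrix with one column per distinct breed
dummies <- sapply(breeds, function(b) as.integer(df$Breed == b))
colnames(dummies) <- breeds

# Append the indicator columns; check.names = FALSE keeps the breed names unchanged
df <- cbind(df, data.frame(dummies, check.names = FALSE))
```

Unlike model.matrix(), the columns here follow the order in which the breeds first appear rather than alphabetical order.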