I am importing data into R from another source (i.e., I cannot easily change the incoming format/values).
Among the variables is one that includes one or more of these possible values:
- Mother (biological mother, foster mother, step mother, etc.)
- Father (biological father, foster father, step father, etc.)
- Grandparent(s) (biological, foster, step, etc.)
- Brother(s) older than 18
- Sister(s) older than 18
- Other adults (aunts, uncles, etc.)
all within the same "cell" so that possible data look like:
Sample Input Data Frame (df)
df <- read.table(text =
"row lives.with.whom
1 'Mother (biological mother, foster mother, step mother, etc.), Father (biological father, foster father, step father, etc.), Grandparent(s) (biological, foster, step, etc.), Brother(s) older than 18, Sister(s) older than 18, Other adults (aunts, uncles, etc.)'
2 ''
3 'Mother (biological mother, foster mother, step mother, etc.), Sister(s) older than 18'
4 'Mother (biological mother, foster mother, step mother, etc.), Father (biological father, foster father, step father, etc.)'", header = TRUE, stringsAsFactors = FALSE)
Within R, how could I efficiently create rules to parse these responses into separate columns, one column for each type of family member, so that the output looks like this:
Sample Output Data Frame
mother <- c(1,0,1,1)
father <- c(1,0,0,1)
adult.brother <- c(1,0,0,0)
adult.sister <- c(1,0,1,0)
grandparent <- c(1,0,0,0)
other.adult <- c(1,0,0,0)
output.df <- cbind(mother, father, adult.brother, adult.sister, grandparent, other.adult)
colnames(output.df) <- c("Mother", "Father", "Brother", "Sister", "Grandparent", "Other adult")
output.df
     Mother Father Brother Sister Grandparent Other adult
[1,]      1      1       1      1           1           1
[2,]      0      0       0      0           0           0
[3,]      1      0       0      1           0           0
[4,]      1      1       0      0           0           0
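To make the target concrete, here is a minimal base R sketch of one way such rules might look. It is not a definitive solution. It assumes each option always begins with the exact text shown in the list above, and the `patterns` lookup (both its names and its match strings) is my own illustrative choice. It relies on `grepl()` with `fixed = TRUE`, so the parentheses are matched literally rather than as regex syntax.

```r
# Rebuild the sample input (same values as df above).
df <- data.frame(
  row = 1:4,
  lives.with.whom = c(
    "Mother (biological mother, foster mother, step mother, etc.), Father (biological father, foster father, step father, etc.), Grandparent(s) (biological, foster, step, etc.), Brother(s) older than 18, Sister(s) older than 18, Other adults (aunts, uncles, etc.)",
    "",
    "Mother (biological mother, foster mother, step mother, etc.), Sister(s) older than 18",
    "Mother (biological mother, foster mother, step mother, etc.), Father (biological father, foster father, step father, etc.)"
  ),
  stringsAsFactors = FALSE
)

# Assumed lookup: output column name -> literal text that starts each option.
# The capital letter plus trailing " (" keeps "Mother (" from matching the
# lower-case "foster mother" inside the parentheses.
patterns <- c(
  Mother        = "Mother (",
  Father        = "Father (",
  Brother       = "Brother(s) older than 18",
  Sister        = "Sister(s) older than 18",
  Grandparent   = "Grandparent(s) (",
  "Other adult" = "Other adults ("
)

# One grepl() call per option; as.integer() turns TRUE/FALSE into 1/0.
# sapply() binds the results into a matrix with one column per pattern name.
output.df <- sapply(patterns, function(p) {
  as.integer(grepl(p, df$lives.with.whom, fixed = TRUE))
})
output.df
```

Because `grepl()` returns `FALSE` for `NA` input, missing responses would be coded as 0 under this sketch, the same as an empty string. If the two need to be distinguished, that would have to be handled separately.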
TIA