I've imported survey data 'df' into R and would like to split a character variable 'symptom' into a set of binary variables plus one additional character variable, according to its stored responses. The 'symptom' variable records all responses to the multiple-choice question 'What symptoms are you experiencing?'. Respondents ticked the box(es) that best describe their symptoms, and the text of each ticked option is stored in 'symptom' as part of one string.
Q: What symptoms are you experiencing?
- Quickly fall into sleep, but wake up shortly
- Feel emotionally, physically weak
- Sleep paralysis. i.e., wide awake but can't move your body
- Lose weight quickly, lack of appetite
- Other, ___
Here is a reproducible data frame:
df <- data.frame(
  id = c(1, 2, 3, 4),
  symptom = c(
    "Quickly fall into sleep, but wake up shortly, Feel emotionally, physically weak, Sleep paralysis. i.e., wide awake but can't move your body",
    "Feel emotionally, physically weak, Lose weight quickly, lack of appetite",
    "Sleep paralysis. i.e., wide awake but can't move your body, Other, increased dreaming",
    "Sleep paralysis. i.e., wide awake but can't move your body"
  )
)
For example, Mike ticked options 1, 2 and 3, so his value in 'symptom' is 'Quickly fall into sleep, but wake up shortly, Feel emotionally, physically weak, Sleep paralysis. i.e., wide awake but can't move your body'. The option texts are separated by commas, but note that several option texts contain commas themselves. When someone ticks the fifth box, the other symptoms must be written in the blank, and that free text is stored in 'symptom' as well, e.g., 'Lose weight quickly, lack of appetite, Other,increased dreaming'.
I have tried lapply(), gsub() and grepl(), but none of them gave the result I need. For example:
lapply(df$symptom, gsub, pattern = "Quickly fall into sleep, but wake up shortly", replacement = 1)
The expected result is five binary variables indicating which symptoms each respondent reported (1 = yes, 0 = no). For respondents who chose the 'Other' option, an additional character variable should store the uncategorized free-text answer as a string.
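One possible way to get this result is sketched below. It is not a definitive solution and rests on some assumptions: the option texts are stored exactly as printed in the questionnaire, 'Other' is always the last item in the string, and the column names (sleep_wake, weak, paralysis, weight, other, other_text) are hypothetical names chosen only for illustration. Each option is matched as a literal string with grepl(fixed = TRUE), so the commas and the "." inside the option texts are not a problem.

```r
# Reproducible data from the question
df <- data.frame(
  id = c(1, 2, 3, 4),
  symptom = c(
    "Quickly fall into sleep, but wake up shortly, Feel emotionally, physically weak, Sleep paralysis. i.e., wide awake but can't move your body",
    "Feel emotionally, physically weak, Lose weight quickly, lack of appetite",
    "Sleep paralysis. i.e., wide awake but can't move your body, Other, increased dreaming",
    "Sleep paralysis. i.e., wide awake but can't move your body"
  ),
  stringsAsFactors = FALSE
)

# Option texts as printed in the questionnaire.
# The names on the left are hypothetical column names.
symptom_labels <- c(
  sleep_wake = "Quickly fall into sleep, but wake up shortly",
  weak       = "Feel emotionally, physically weak",
  paralysis  = "Sleep paralysis. i.e., wide awake but can't move your body",
  weight     = "Lose weight quickly, lack of appetite"
)

# One 0/1 column per fixed option; fixed = TRUE matches the text literally
for (nm in names(symptom_labels)) {
  df[[nm]] <- as.integer(grepl(symptom_labels[[nm]], df$symptom, fixed = TRUE))
}

# Remove the known option texts so that only separators and the
# "Other, ..." part (if any) remain
rest <- df$symptom
for (opt in symptom_labels) {
  rest <- gsub(opt, "", rest, fixed = TRUE)
}

# Flag "Other" and keep whatever follows it as free text
# (assumes "Other" is the last item in the string)
df$other <- as.integer(grepl("Other,", rest, fixed = TRUE))
df$other_text <- ifelse(
  df$other == 1,
  trimws(sub("^.*?Other,\\s*", "", rest, perl = TRUE)),
  NA_character_
)

print(df[, c("id", names(symptom_labels), "other", "other_text")])
```

Matching whole option texts avoids splitting on commas, which would break here because the options contain commas themselves. If a free-text answer happens to repeat one of the option texts word for word, this sketch would miscount it, so that case would need separate handling.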
Thanks in advance.
Expected output: https://i.stack.imgur.com/8CjbT.png