I have a data frame with one column (x1) containing string values. I am using these strings to set the corresponding logical values in the other columns of the data frame. Those columns are named after the possible sub-strings in x1, e.g. 'Dog', 'Cat', 'Bird'.
I've already figured out how to use strsplit() to parse each string in column x1. I also know how to mutate/modify the other three columns based on those strsplit() results.
What I'm currently stuck on is how to apply the for loop below to each row in my data frame.
x0 <- c(1,2,3,4,5)
x1 <- c("Dog, Cat", "Cat", "Dog, Bird", "Cat, Bird, Dog", "Cat, Bird")
Dog <- rep(FALSE, 5)
Cat <- rep(FALSE, 5)
Bird <- rep(FALSE, 5)
example_df <- data.frame(x0, x1, Dog, Cat, Bird, stringsAsFactors = FALSE)
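# This handles a single row only (row 1 here); doing it for every row is the open question
animals <- strsplit(as.character(example_df$x1[1]), ", ")[[1]]
for (animal in animals) {
  example_df[1, animal] <- TRUE
}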
So for the example above, the first row should end up with example_df$Dog and example_df$Cat both TRUE, while example_df$Bird stays FALSE. The second row should have only example_df$Cat as TRUE, and so on.
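To make the target concrete, the desired result for the example data can be written out by hand. The name expected_df is only introduced here for illustration; the values follow directly from the strings in x1.

```r
# Hand-written target for the five example rows (expected_df is an illustrative name)
expected_df <- data.frame(
  x0   = c(1, 2, 3, 4, 5),
  x1   = c("Dog, Cat", "Cat", "Dog, Bird", "Cat, Bird, Dog", "Cat, Bird"),
  Dog  = c(TRUE, FALSE, TRUE, TRUE, FALSE),
  Cat  = c(TRUE, TRUE, FALSE, TRUE, TRUE),
  Bird = c(FALSE, FALSE, TRUE, TRUE, TRUE),
  stringsAsFactors = FALSE
)
```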
Another note: the example above only has three animal values, but I'm looking for a method that scales to a large number of them. I know I could copy and paste a line like this for each animal:
library(dplyr)
library(stringr)
example_df %>%
  mutate(Dog = str_detect(x1, "Dog"))  # one line like this per animal
But unfortunately, this method is not going to scale well if I have 10 or 20 possible animal substring values in column x1.
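One direction that might avoid the copy and paste, as a rough sketch rather than a tested solution: split x1 once, then fill every animal column in a single loop over the animal names. This assumes the separator is always ", " and that each column is named exactly like its sub-string. The names parts and animals are introduced here for illustration, and animals is derived from the data, which may not be appropriate if the set of columns is fixed in advance.

```r
# Example data from the question
example_df <- data.frame(
  x0 = c(1, 2, 3, 4, 5),
  x1 = c("Dog, Cat", "Cat", "Dog, Bird", "Cat, Bird, Dog", "Cat, Bird"),
  Dog = FALSE, Cat = FALSE, Bird = FALSE,
  stringsAsFactors = FALSE
)

# Split every string once: a list with one character vector per row
parts <- strsplit(example_df$x1, ", ", fixed = TRUE)

# Columns to fill (assumption: taken from the data; could also be a fixed vector)
animals <- unique(unlist(parts))

# For each animal, check row by row whether it is among that row's sub-strings
for (animal in animals) {
  example_df[[animal]] <- vapply(parts, function(p) animal %in% p, logical(1))
}
```

Because the loop runs over animal names rather than rows, adding a 10th or 20th animal needs no extra code. Matching on the split pieces with %in% also avoids partial matches that plain sub-string detection could produce (for example "Cat" inside "Caterpillar").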