I'm working with a dataset that has many columns called status1, status2, etc. Within those columns, it says if someone is exempt, complete, registered, etc.
Unfortunately, the exempt inputs are not consistent; here's a sample:
library(dplyr)
problem <- tibble(person = c("Corey", "Sibley", "Justin", "Ruth"),
status1 = c("7EXEMPT", "Completed", "Completed", "Pending"),
status2 = c("exempt", "Completed", "Completed", "Pending"),
status3 = c("EXEMPTED", "Completed", "Completed", "ExempT - 14"))
I'm trying to use case_when() to make a new column that has their final status. If it ever says completed, then they are completed. If it ever says exempt without saying complete, then they are exempt.
The important part is that I want my code to use contains("status"), or some equivalent that only targets the status columns and doesn't require typing them all, and I want it to only require a partial string match for exempt.
As for using contains with case_when, I saw this example, but I wasn't able to apply it to my case: mutate with case_when and contains
This is what I've tried to use so far, but as you can guess, it has not worked:
library(purrr)
library(dplyr)
library(stringr)
solution <- problem %>%
mutate(final= case_when(pmap_chr(select(., contains("status")), ~
any(c(...) == str_detect(., "Exempt") ~ "Exclude",
TRUE ~ "Complete"
))))
Here's what I want the final product to look like:
solution <- tibble(person = c("Corey", "Sibley", "Justin", "Ruth"),
status1 = c("7EXEMPT", "Completed", "Completed", "Pending"),
status2 = c("exempt", "Completed", "Completed", "Pending"),
status3 = c("EXEMPTED", "Completed", "Completed", "ExempT - 14"),
final = c("Exclude", "Completed", "Completed", "Exclude"))
Thank you!