I have a function in R which uses case_when:
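library(dplyr)

# Label rows as case (1) or control (0) based on the values in the column named by `col`
myfunction <- function(df, col, case_name, cntl_name) {
  object <- df %>%
    mutate(
      class = case_when(
        col == case_name ~ 1,
        col == cntl_name ~ 0
      )
    )
  return(object)
}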
So if I have this object:
df <- structure(
  list(
    id = c("ID1", "ID2", "ID3", "ID4", "ID5"),
    phenotype = c("blue", "blue", "red", "green", "red"),
    treatment = c("treat1", "treat2", "none", "none", "none"),
    weeks_of_treatment = c(0, 0, 0, 0, 0)
  ),
  row.names = c("ID1", "ID2", "ID3", "ID4", "ID5"),
  class = "data.frame"
)
> df
     id phenotype treatment weeks_of_treatment
ID1 ID1      blue    treat1                  0
ID2 ID2      blue    treat2                  0
ID3 ID3       red      none                  0
ID4 ID4     green      none                  0
ID5 ID5       red      none                  0
And run:
newdf <- myfunction(df, "phenotype", "red", "blue")
It should return a dataframe that looks like this:
   id phenotype treatment weeks_of_treatment class
1 ID1      blue    treat1                  0     0
2 ID2      blue    treat2                  0     0
3 ID3       red      none                  0     1
4 ID4     green      none                  0    NA
5 ID5       red      none                  0     1
But it doesn't - it returns this:
> newdf
   id phenotype treatment weeks_of_treatment class
1 ID1      blue    treat1                  0    NA
2 ID2      blue    treat2                  0    NA
3 ID3       red      none                  0    NA
4 ID4     green      none                  0    NA
5 ID5       red      none                  0    NA
It does not recognise the variable col as the column phenotype. Does anyone know how to input a dynamic variable into case_when?
I have tried other solutions for variables in dplyr (e.g., wrapping col in double brackets, [[col]]) but I can't find anything that works.
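For comparison, here is a minimal sketch of one approach that may work, assuming col is always passed as a string holding the column name. It uses dplyr's .data pronoun, so .data[[col]] looks up the column by name instead of comparing the literal string "phenotype" against case_name. The name myfunction2 is hypothetical and only used for illustration.

```r
library(dplyr)

# Hypothetical variant of myfunction: .data[[col]] retrieves the column
# whose name is stored in the string `col`.
myfunction2 <- function(df, col, case_name, cntl_name) {
  df %>%
    mutate(
      class = case_when(
        .data[[col]] == case_name ~ 1,
        .data[[col]] == cntl_name ~ 0
        # rows matching neither value are left as NA
      )
    )
}

# Small example data, mirroring the phenotype column from the question
df <- data.frame(
  id = c("ID1", "ID2", "ID3", "ID4", "ID5"),
  phenotype = c("blue", "blue", "red", "green", "red")
)

newdf <- myfunction2(df, "phenotype", "red", "blue")
```

With this sketch, the class column should be 1 for "red" rows, 0 for "blue" rows, and NA for anything else.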