This seems to be a fairly simple task, but I couldn't figure it out after studying the documentation of ifelse() and dplyr::if_else(), as well as several similar posts on SO about applying ifelse() to multiple columns in a data frame.
My goal: I have the following data frame with columns of different data types. On each row, I want to reset the values in the first 3 columns to NA if the column valid is FALSE.
The problem: I used dplyr::across() and ifelse() to change the values as I wanted, but the Date column date and the factor column team were coerced to numeric (as shown in the reprex below), which is not what I want. I know that dplyr::if_else() preserves data types, but it doesn't work across columns of different data types, either.
I know tdf[tdf$valid == FALSE, !grepl("valid", names(tdf))] <- NA would achieve my goal, but I would prefer a tidyverse approach that I can use in my data cleaning pipeline. Many thanks in advance!
library(dplyr)
tdf <- tibble(
  date = c(as.Date("2021-12-10"), as.Date("2021-12-11")),
  team = factor(1:2, labels = c("T1", "T2")),
  score = 3:4,
  valid = c(TRUE, FALSE)
)
tdf
#> # A tibble: 2 x 4
#>   date       team  score valid
#>   <date>     <fct> <int> <lgl>
#> 1 2021-12-10 T1        3 TRUE
#> 2 2021-12-11 T2        4 FALSE
tdf %>% mutate(across(-valid, ~ ifelse(valid, ., NA)))
#> # A tibble: 2 x 4
#>    date  team score valid
#>   <dbl> <int> <int> <lgl>
#> 1 18971     1     3 TRUE
#> 2    NA    NA    NA FALSE
Created on 2021-12-10 by the reprex package (v2.0.1)
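One direction that might fit a pipeline, offered only as a minimal sketch: swap ifelse() for base R's replace() inside across(). replace() assigns NA into the existing vector instead of building a new one, so the Date, factor and integer classes should be kept. The sketch assumes valid contains no missing values, and the name cleaned is purely illustrative.

```r
library(dplyr)

# Same toy data as in the reprex above
tdf <- tibble(
  date = as.Date(c("2021-12-10", "2021-12-11")),
  team = factor(1:2, labels = c("T1", "T2")),
  score = 3:4,
  valid = c(TRUE, FALSE)
)

# replace(x, idx, NA) runs x[idx] <- NA and returns x,
# so each column keeps its original class.
# `cleaned` is an illustrative name, not part of the original question.
cleaned <- tdf %>%
  mutate(across(-valid, ~ replace(.x, !valid, NA)))

# Inspect the resulting column classes
sapply(cleaned, function(col) class(col)[1])
```

Because replace() relies on subset assignment, it sidesteps the attribute stripping that ifelse() performs, which is what turned the date and factor columns into plain numbers.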