Recent versions of dplyr deprecate the underscore variants of its verbs, such as filter_, in favour of tidy evaluation.
What is the expected replacement for the underscore forms under tidy evaluation? And how do I write it so that R CMD check does not report undefined symbols?
library(dplyr)
df <- data_frame(id = rep(c("a","b"), 3), val = 1:6)
df %>% filter_(~id == "a")
# want to avoid the following form, because it references column id as if it were a variable
df %>% filter( id == "a" )
# option A
df %>% filter( UQ(rlang::sym("id")) == "a" )
# option B
df %>% filter( UQ(as.name("id")) == "a" )
# option C
df %>% filter( .data$id == "a" )
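For comparison, here is a minimal self-contained sketch of the three options on the toy data, written with !! (the operator form of UQ(), which the benchmark further down also uses) and with tibble() standing in for data_frame(). The last variant, with the column name held in a string called col, is a hypothetical addition of mine and not one of the options above. As far as I know, in a package the .data pronoun can be imported with @importFrom rlang .data so that R CMD check does not complain about it.

```r
library(dplyr)

# Toy data mirroring the example above; tibble() is used in place of data_frame()
df <- tibble(id = rep(c("a", "b"), 3), val = 1:6)

# Options A and B: build a symbol from a string and unquote it with !!
# (parentheses make the precedence of !! explicit)
res_sym  <- df %>% filter((!!rlang::sym("id")) == "a")
res_name <- df %>% filter((!!as.name("id")) == "a")

# Option C: refer to the column through the .data pronoun
res_data <- df %>% filter(.data$id == "a")

# Hypothetical variant: the column name is stored in a string
col <- "id"
res_str <- df %>% filter(.data[[col]] == "a")
```

All four calls should select the same rows, so the choice between them comes down to readability and speed.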
Is there a preferred or more concise form? Option C is the shortest, but it is slower on some of my larger real-world datasets and more complex dplyr constructs:
library(microbenchmark)
microbenchmark(
  sym = dsPClosest %>%
    group_by(!!sym(dateVarName), !!sym("depth")) %>%
    summarise(
      temperature = mean(!!sym("temperature"), na.rm = TRUE),
      moisture = mean(!!sym("moisture"), na.rm = TRUE)
    ) %>%
    ungroup(),
  data = dsPClosest %>%
    group_by(!!sym(dateVarName), .data$depth) %>%
    summarise(
      temperature = mean(.data$temperature, na.rm = TRUE),
      moisture = mean(.data$moisture, na.rm = TRUE)
    ) %>%
    ungroup(),
  times = 10
)
# Unit: milliseconds
# expr        min         lq      mean     median        uq       max neval
#  sym   80.05512   84.97267  122.7513   94.79805  100.9679  392.1375    10
# data 4652.83104 4741.99165 5371.5448 5039.63307 5471.9261 7926.7648    10
There is another answer that covers mutate_, but it uses even more complex syntax.
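For reference, this is roughly what I would expect a mutate_ replacement to look like with the same tools. It is only a sketch: the names src_col and new_col are hypothetical, and it assumes that both the source column and the new column name are given as strings.

```r
library(dplyr)

# Toy data as above; tibble() is used in place of data_frame()
df <- tibble(id = rep(c("a", "b"), 3), val = 1:6)

# Hypothetical column names held in strings
src_col <- "val"
new_col <- "val_doubled"

# !! on the left of := injects the new column name;
# .data[[src_col]] looks up the source column by its string name
res <- df %>% mutate(!!new_col := .data[[src_col]] * 2)
```

The := operator is needed here because R does not allow !! on the left-hand side of an ordinary = in a function call.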