I'm relatively new to tidy evaluation, and while the functions I'm writing work, I want to understand why different helper functions are used. For example, what is the difference between enquo and ensym? In the function below, which computes a daily average and a moving average, they appear to be interchangeable:

library(dplyr)
library(lubridate)
library(rlang)
library(zoo)

manipulate_for_ma <- function(data, group_var, da_col_name, summary_var, ma_col_name) {
  group_var <- ensym(group_var) 
  summary_var <- enquo(summary_var)
  da_col_name <- ensym(da_col_name) 
  ma_col_name <- enquo(ma_col_name)
  
  data %>% 
    group_by(!!group_var) %>%
    summarise(!!da_col_name := mean(!!summary_var, na.rm = TRUE)) %>% 
    mutate(!!ma_col_name := rollapply(!!da_col_name,
                                      30,
                                      mean,
                                      na.rm = TRUE,
                                      partial = TRUE,
                                      fill = NA)) %>% 
    rename(date = !!group_var)
}

lakers %>%
 mutate(date = ymd(date)) %>%
 manipulate_for_ma(group_var = date,
                   da_col_name = points_per_play_da,
                   summary_var = points,
                   ma_col_name = points_per_play_ma)

# A tibble: 78 x 3
   date       points_per_play_da points_per_play_ma
   <date>                  <dbl>              <dbl>
 1 2008-10-28              0.413              0.458
 2 2008-10-29              0.431              0.459
 3 2008-11-01              0.408              0.456
 4 2008-11-05              0.386              0.457

I've read about enquo here and ensym here. Is the difference that ensym is more restrictive and only takes strings or string-like objects?

Calimo
Ben G
  • If it only takes strings, then how is it working with `group_var = date` – akrun Sep 16 '19 at 15:50
  • @akrun--good point, further emphasizing my lack of understanding here. This is what Hadley says in Advanced R: Sometimes you only want to allow the user to specify a variable name, not an arbitrary expression. In this case, you can use `ensym()` or `ensyms()`. These are variants of `enexpr()` and `enexprs()` that check the captured expression is either symbol or a string (which is converted to a symbol). `ensym()` and `ensyms()` throw an error if given anything else. – Ben G Sep 16 '19 at 15:51
  • It would take both strings and non-strings. You can change `date` to `"date"` and check with `ensym` and `enquo`. The former works, while the latter gives an error. – akrun Sep 16 '19 at 15:53
  • @akrun-so is the only difference that `ensym` allows you to pass strings as arguments *as well as* symbols? – Ben G Sep 16 '19 at 15:59
  • By looking at the help page: `ensym() and ensyms() are variants of enexpr() and enexprs() that check the captured expression is either a string (which they convert to symbol) or a symbol.` `quo and enquo() are similar to their expr counterparts but capture both the expression and its environment in an object called a quosure.` – akrun Sep 16 '19 at 16:06
  • Also note that when programming such functions with dplyr, one is probably better off using the new curly-curly style: `manipulate_for_ma <- function(data, group_var, da_col_name, summary_var, ma_col_name) { data %>% group_by({{ group_var }}) %>% summarise({{ da_col_name }} := mean({{ summary_var }}, na.rm = TRUE)) %>% ... }`. Reference: https://www.tidyverse.org/articles/2019/06/rlang-0-4-0/ – Aurèle Sep 18 '19 at 12:18
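
For readers who want to try the curly-curly style mentioned in the comments, here is a minimal sketch. It is not the commenter's full code: the function name `manipulate_for_ma_cc`, the `width` argument, and the toy data frame are assumptions made for illustration, and it assumes rlang 0.4.0 or later so that `{{ }}` is available.

```r
library(dplyr)
library(zoo)

# Hypothetical curly-curly variant of the question's function.
# `width` is an added parameter (assumption) so the toy data can use a short window.
manipulate_for_ma_cc <- function(data, group_var, da_col_name,
                                 summary_var, ma_col_name, width = 30) {
  data %>%
    group_by({{ group_var }}) %>%
    # {{ }} on the left of := names the new column after the supplied symbol
    summarise({{ da_col_name }} := mean({{ summary_var }}, na.rm = TRUE)) %>%
    mutate({{ ma_col_name }} := rollapply({{ da_col_name }}, width, mean,
                                          na.rm = TRUE, partial = TRUE)) %>%
    rename(date = {{ group_var }})
}

# Toy data (made up for this sketch): two observations per day
toy <- data.frame(day = c(1, 1, 2, 2, 3, 3),
                  points = c(0, 2, 2, 4, 4, 6))

res_cc <- manipulate_for_ma_cc(toy, day, daily_avg, points, moving_avg, width = 3)
res_cc
```

No `ensym()` or `enquo()` calls are needed here; `{{ x }}` is shorthand for `!!enquo(x)`, so the arguments are captured as quosures.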

2 Answers

Another take:

library(rlang)
library(dplyr, warn.conflicts = FALSE)

test <- function(x){
  Species <- "bar"
  cat("--- enquo builds a quosure from any expression\n")
  print(enquo(x))
  cat("--- ensym captures a symbol or a literal string as a symbol\n")
  print(ensym(x))
  cat("--- evaltidy will evaluate the quosure in its environment\n")
  print(eval_tidy(enquo(x)))
  cat("--- evaltidy will evaluate a symbol locally\n")
  print(eval_tidy(ensym(x)))
  cat("--- but both work fine where the environment doesn't matter\n")
  identical(select(iris,!!ensym(x)), select(iris,!!enquo(x)))
}

Species = "foo"
test(Species)
#> --- enquo builds a quosure from any expression
#> <quosure>
#> expr: ^Species
#> env:  global
#> --- ensym captures a symbol or a literal string as a symbol
#> Species
#> --- evaltidy will evaluate the quosure in its environment
#> [1] "foo"
#> --- evaltidy will evaluate a symbol locally
#> [1] "bar"
#> --- but both work fine where the environment doesn't matter
#> [1] TRUE

test("Species")
#> --- enquo builds a quosure from any expression
#> <quosure>
#> expr: ^"Species"
#> env:  empty
#> --- ensym captures a symbol or a literal string as a symbol
#> Species
#> --- evaltidy will evaluate the quosure in its environment
#> [1] "Species"
#> --- evaltidy will evaluate a symbol locally
#> [1] "bar"
#> --- but both work fine where the environment doesn't matter
#> [1] TRUE

test(paste0("Spec","ies"))
#> --- enquo builds a quosure from any expression
#> <quosure>
#> expr: ^paste0("Spec", "ies")
#> env:  global
#> --- ensym captures a symbol or a literal string as a symbol
#> Only strings can be converted to symbols

Created on 2019-09-23 by the reprex package (v0.3.0)

moodymudskipper
  • Yes, that makes sense. Thanks! – Ben G Sep 23 '19 at 17:54
  • @BenG: Not to make things more complicated, but there's also `enexpr`, which is somewhere between `ensym` and `enquo`: 1) Both `enexpr` and `enquo` allow for arbitrary expressions, while `ensym` only allows for "variable name"-like expressions; 2) Both `enexpr` and `ensym` do NOT capture the environment, making their evaluation context-dependent, while `enquo` expressions will always evaluate the same way everywhere. – Artem Sokolov Sep 23 '19 at 18:15
  • Indeed, `enexpr()` is like `ensym()` on symbols, but unlike `ensym()` it allows complex expressions. Because of this lack of restriction, it cannot guess that a string literal is meant as a symbol. `enexpr()` is a lot like single-argument `substitute()`, except that it supports unquoting via `!!` in the function arguments. – moodymudskipper Sep 23 '19 at 19:05
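
The three-way comparison from the comments above can be checked with a small sketch. The wrapper names `f_expr`, `f_sym`, and `f_quo` are hypothetical helpers introduced only for this illustration; the predicates (`is_call()`, `is_quosure()`, `is_symbol()`, `is_string()`) are from rlang.

```r
library(rlang)

# Hypothetical one-line wrappers, one per capture function
f_expr <- function(x) enexpr(x)
f_sym  <- function(x) ensym(x)
f_quo  <- function(x) enquo(x)

# Arbitrary expressions: accepted by enexpr() and enquo()
is_call(f_expr(a + b))      # a bare call, no environment attached
is_quosure(f_quo(a + b))    # the same call plus the environment it came from

# String literals: ensym() converts them to a symbol, enexpr() keeps the string
is_symbol(f_sym("a"))
is_string(f_expr("a"))

# ensym() rejects anything that is not a symbol or a string
sym_result <- tryCatch(f_sym(a + b), error = function(e) e)
inherits(sym_result, "error")
```

Each of the five checks should return `TRUE` if the behaviour described in the comments holds for the installed rlang version.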

Here is one example illustrating one difference (namely that enquo captures the calling environment and ensym doesn't). Hopefully it speaks for itself:

library(rlang)

f <- function(x) {
  foo <- 42
  print(eval_tidy(quo(!! ensym(x))))
  print(eval_tidy(quo(!! enquo(x))))
}
foo <- 3
f(foo)
# [1] 42
# [1] 3

Or the slightly more convoluted:

library(rlang)

myenv <- new.env()

local(envir = myenv, {
  foo <- 17
  g <- function(x) {
    print(eval_tidy(quo(!! ensym(x))))
    print(eval_tidy(quo(!! enquo(x))))
  }
})

foo <- 123
myenv$g(foo)
#> [1] 17
#> [1] 123

Created on 2019-09-18 by the reprex package (v0.3.0)

The difference is often not noticeable when using dplyr, because its verbs always look up names in the columns of the .data argument first and only then fall back to an environment.
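
A rough way to see this without dplyr is to pass a data frame to `eval_tidy()` as a data mask, which is, as far as I understand, the same mechanism dplyr relies on. The function `h`, the variable values, and the one-column data frame below are made up for this sketch.

```r
library(rlang)

h <- function(data, x) {
  mpg <- "local value"   # local variable that shadows the column name
  list(
    sym = eval_tidy(ensym(x), data = data),  # symbol, evaluated with a data mask
    quo = eval_tidy(enquo(x), data = data)   # quosure, evaluated with a data mask
  )
}

mpg <- "global value"
res <- h(data.frame(mpg = 1:3), mpg)

# Both lookups hit the column first, so the environment never comes into play
identical(res$sym, res$quo)
```

Only when the name is absent from the data does the lookup fall back to an environment, and that is where `ensym()` and `enquo()` can start to disagree.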

Aurèle
  • I don't really get it--as it is I'm only just starting to understand quosures intuitively. I'm sure as I start to use it more it'll make more sense. One thing I'm starting to think about, which I'm not sure is correct, is that I have to use these `rlang` quoting functions with the end user in mind. – Ben G Sep 18 '19 at 12:48
  • You don't necessarily have to use rlang. Maybe use the curly-curly syntax instead, that is one level of abstraction higher, as suggested in https://stackoverflow.com/questions/57960245/what-is-the-difference-between-ensym-and-enquo-when-programming-with-dplyr/57990624?noredirect=1#comment102392073_57960245 – Aurèle Sep 18 '19 at 13:09
  • A recommended read if you don't mind going down the rabbit hole is http://blog.obeautifulcode.com/R/How-R-Searches-And-Finds-Stuff/ . I seem to recall it's even referenced by Hadley in his book. Pointer: it has to do with the different sorts of environments that are involved with functions (enclosing, binding, calling...). Or go back a few chapters to https://adv-r.hadley.nz/environments.html#special-environments – Aurèle Sep 18 '19 at 13:12