
I recently noticed that rlang::sym doesn't seem to work in anonymous functions, and I don't understand why. Here is an example; it's pretty clumsy and ugly, but I think it illustrates the point:

library(tidyverse)
data <- tibble(x1 = letters[1:3],
               x2 = letters[4:6],
               val = 1:3)

get_it <- function(a, b){
    data %>%
        mutate(y1 = !!rlang::sym(a)) %>%
        mutate(y2 = !!rlang::sym(b)) %>%
        select(y1, y2, val)
}
get_it("x1", "x2")

This defines some toy data and a (horrible) function that essentially renames the columns based on column names. Now I can do the same thing for different combinations of a and b:

d <- tibble(x = c("x1", "x2"),
            y = c("x2", "x1"))
d %>% mutate(tmp = map2(x, y, get_it))

However, if I try to do the exact same thing with an anonymous function it doesn't work:

d %>% mutate(tmp = map2(x, y, function(a, b){
    data %>%
        mutate(y1 = !!rlang::sym(a)) %>%
        mutate(y2 = !!rlang::sym(b)) %>%
        select(y1, y2, val)
}))

This fails with `object 'a' not found`, even though the two functions are identical; the only difference is that this one is anonymous. Can anyone explain why?

Tung
Jonas
  • Hmm, real puzzler. I think it must be to do with the environment where the function is defined, but haven't been able to twig the difference... – Calum You Aug 17 '18 at 21:06
  • It may not be a bug but I'd report it as an issue on Git. – CPak Aug 17 '18 at 21:23
  • If we eliminate rlang, which isn't actually needed here, then it works: `function(a, b) data %>% mutate(y1 = .[[a]], y2 = .[[b]]) %>% select(y1, y2, val)` so it seems anonymous functions work but not rlang in them. – G. Grothendieck Aug 17 '18 at 21:23
  • Unquoting is not a function call: it always takes effect at the very first, outermost, quoting function. That's why you have to be a bit careful with anonymous functions. Unquoting happens immediately while anonymous functions denote a scope that is created later on, so there's a timing problem. – Lionel Henry Aug 18 '18 at 08:51
  • This is one of the reasons we decided to deprecate `UQ()` and `UQS()` which look too much like function calls despite having very different semantics. – Lionel Henry Aug 18 '18 at 08:52

1 Answer


The issue is not anonymous functions as such, but the precedence of `!!`, that is, when it gets evaluated. The help page for `!!` states:

The !! operator unquotes its argument. It gets evaluated immediately in the surrounding context.

This implies that when you write a complex NSE expression, such as select inside mutate, the unquoting will take place in the environment of the expression as a whole. As pointed out by @lionel, the unquoting then takes precedence over other things, such as creation of anonymous function environments.
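
The timing can be seen in isolation with a minimal sketch. It uses `rlang::expr()` as a stand-in for the quoting that `mutate()` performs on its arguments; the variable names are illustrative only and the snippet is not part of the original example:

```r
library(rlang)

# expr() quotes its argument and processes !! while doing so.
# No `a` exists at this point, so the unquoting fails before any
# function is ever created or called.
err <- tryCatch(
  expr(function(a) !!sym(a)),
  error = function(e) e
)
inherits(err, "error")

# With an `a` visible at capture time, !! uses that value immediately;
# the function's own parameter `a` plays no part.
a <- "x1"
captured <- expr(function(a) !!sym(a))

# Third element of a `function` call is its body: here the symbol x1,
# already substituted in.
captured[[3]]
```

In both cases the `!!` is resolved while the expression is being captured, which is earlier than the moment the anonymous function would bind its argument.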

In your case the `!!` unquoting is performed by the outer mutate() as soon as it captures its argument, before the anonymous function has been called. At that point `a` and `b` have no values yet, hence `object 'a' not found`. There are two possible solutions:

1) Pull the expression involving !! into a standalone function (as you've done in your question):

res1 <- d %>% mutate(tmp = map2(x, y, get_it))

2) Replace !! with eval to delay expression evaluation:

res2 <- d %>% mutate(tmp = map2(x, y, function(a, b){
  data %>%
    mutate(y1 = eval(rlang::sym(a))) %>%
    mutate(y2 = eval(rlang::sym(b))) %>%
    select(y1, y2, val)
}))

identical(res1, res2)       #TRUE
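
A further variant, not part of the original answer, avoids unquoting altogether, in the spirit of the `.[[a]]` suggestion from the comments. This is a sketch that assumes dplyr's `.data` pronoun, which looks a column up by its string name when the inner `mutate()` runs; `res3` is a name introduced here for illustration:

```r
library(dplyr)
library(purrr)

# Same toy data as in the question
data <- tibble(x1 = letters[1:3],
               x2 = letters[4:6],
               val = 1:3)
d <- tibble(x = c("x1", "x2"),
            y = c("x2", "x1"))

# .data[[a]] is an ordinary subsetting call, so nothing is unquoted when
# the outer mutate() captures the expression; the lookup happens later,
# when the anonymous function is called with concrete values of a and b.
res3 <- d %>% mutate(tmp = map2(x, y, function(a, b) {
  data %>%
    mutate(y1 = .data[[a]], y2 = .data[[b]]) %>%
    select(y1, y2, val)
}))

res3$tmp[[1]]
```

Because no `!!` is involved, the timing problem described above does not arise.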
Artem Sokolov
  • This is overall a good answer but the issue is not the nested mutate. It's the unquoting inside an anonymous function, as you have shown with the `!!i` example. – Lionel Henry Aug 18 '18 at 08:50
  • Though I guess with nested mutates you could create a similar timing issue. – Lionel Henry Aug 18 '18 at 08:56
  • Thanks for the clarification, @lionel. It's always great to hear from the developers. The reason I thought it was a nesting issue was because my last example has `!!` inside an anonymous function and works fine. But you're right; it comes down to the operator precedence inside NSE expressions, rather than nesting. I've modified the language in the answer to be a bit more precise. (P.S. Big fan of your `rlang` package!) – Artem Sokolov Aug 18 '18 at 15:00
  • Based on [my related issue](https://stackoverflow.com/questions/58899541/looping-over-a-list-of-filter-expressions-problem-with-nse-in-map2-call-within) you could add to your answer that using `eval` or `eval_tidy` instead of `!!` solves the problem of precedence, which emerges especially in the context of nested `mutate`s. – TimTeaFan Nov 22 '19 at 20:33
  • @TimTeaFan: Done. Thanks for the suggestion. – Artem Sokolov Nov 22 '19 at 20:53