Replacing group_by_ with group_by when the argument is a string in dplyr

Question

I have some code that specifies a grouping variable as a string.

group_var <- "cyl"

My current code for using this grouping variable in a dplyr pipeline is:

mtcars %>% 
     group_by_(group_var) %>% 
     summarize(mean_mpg = mean(mpg))

My best guess as to how to replace the deprecated group_by_ function with group_by is:

mtcars %>% 
     group_by(!!as.name(group_var)) %>% 
     summarize(mean_mpg = mean(mpg))

This works but is not explicitly mentioned in the programming with dplyr vignette.

Is using !!as.name() the preferred way to replace group_by_() with group_by()?

Another option is `group_by_at` - `group_by_at(mtcars, group_var)`. — aosmith, Nov 02 '17 at 17:51
This might be helpful: https://stackoverflow.com/questions/47056091/arrange-doesnt-recognize-column-name-parameter/47056273#47056273 — acylam, Nov 02 '17 at 19:35
You can also use `library(rlang); group_by(!!parse_quosure(group_var))` — acylam, Nov 02 '17 at 19:41
For context this is in a shiny app and the grouping variable is a user input contained in the variable `input$group_var`. — Adam Black, Nov 02 '17 at 19:41
@useR Thanks. I think `parse_quosure()` is the function I'm after. `group_by_at` works but doesn't generalize to solving this problem with other tidyverse functions. — Adam Black, Nov 02 '17 at 19:54
Is there any reason to use `parse_quosure()` over `as.name()`? — Adam Black, Nov 02 '17 at 19:55
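
On that last question, one practical difference can be sketched as follows. This is a minimal illustration, assuming the rlang package is installed. `sym()` and `parse_expr()` are rlang functions not mentioned in the thread; `parse_expr()` is used here only as a stand-in for the string-parsing step that a function like `parse_quosure()` performs.

```r
library(rlang)

# as.name() and rlang::sym() both turn a string into a single symbol
same_symbol <- identical(as.name("cyl"), sym("cyl"))

# A parsing function reads the string as R code instead
simple   <- parse_expr("cyl")      # a plain column name still yields a symbol
compound <- parse_expr("cyl * 2")  # an expression string yields a call

is.symbol(simple)
is.call(compound)

# as.name() never parses: the whole string becomes one (unusual) name
odd_name <- as.name("cyl * 2")
is.symbol(odd_name)
```

For a plain column name the two routes should behave the same. They differ only when the string contains code, which may matter when the string comes from user input, as in the Shiny case described above.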

Dave Gruenewald · Accepted Answer · 2017-11-02T19:54:53.993

Is this within a function? If not, I think the !!as.name() part is unnecessary, and I would stick with the group_by_at(group_var) suggestion by @aosmith for simplicity's sake. If it is within a function, I would set it up like so:

library(dplyr)

examplr <- function(data, group_var){
  # Convert the string to a symbol once, so it can be unquoted with !! below
  group_var <- as.name(group_var)

  data %>% 
    group_by(!!group_var) %>% 
    summarize(mean_mpg = mean(mpg))
}

examplr(data = mtcars,
        group_var = "cyl")
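
As a rough check that the approaches discussed in this thread agree, here is a small sketch comparing them on `mtcars`. The `across(all_of())` variant is not from the thread; it is included on the assumption that dplyr 1.0.0 or later is installed, where `group_by_at()` is marked as superseded.

```r
library(dplyr)

group_var <- "cyl"

# Approach from the question: convert the string to a symbol, then unquote it
by_symbol <- mtcars %>%
  group_by(!!as.name(group_var)) %>%
  summarize(mean_mpg = mean(mpg))

# Approach from the comments: group_by_at() accepts column names as strings
by_at <- mtcars %>%
  group_by_at(group_var) %>%
  summarize(mean_mpg = mean(mpg))

# Assumption: dplyr >= 1.0.0, where across() + all_of() selects by string
by_across <- mtcars %>%
  group_by(across(all_of(group_var))) %>%
  summarize(mean_mpg = mean(mpg))

# Compare as plain data frames to sidestep tibble-specific comparison methods
same_at     <- isTRUE(all.equal(as.data.frame(by_symbol), as.data.frame(by_at)))
same_across <- isTRUE(all.equal(as.data.frame(by_symbol), as.data.frame(by_across)))
```

If both flags come back `TRUE`, the choice between these forms is mostly a matter of readability and of which dplyr version you target.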

edited Nov 02 '17 at 19:54

answered Nov 02 '17 at 18:11

Dave Gruenewald


Any reason not to do this `group_by(!!as.name(group_var))`? – Adam Black Nov 02 '17 at 19:45
No reason immediately comes to mind, but since you asked for preferred method, I figured `group_by_at()` is simpler than `group_by(!!as.name())`. But if you plan on using `group_var` multiple times in your `dplyr` pipeline, I would recommend using the function approach I outlined above. That way, you will only need to call on your variable by `!!group_var` rather than `!!as.name(group_var)` each time. – Dave Gruenewald Nov 02 '17 at 19:51
You can actually use `group_by_at` in a function and pass the variable name as a string directly to it (no "unquoting" needed). – aosmith Nov 02 '17 at 19:51
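
A minimal sketch of that suggestion, for illustration only: `mean_mpg_by` is a hypothetical helper name, and the summary column is hard-coded to `mpg` as in the answer above.

```r
library(dplyr)

# Hypothetical helper: group_var is a character vector of column names,
# passed straight to group_by_at() with no quoting or unquoting
mean_mpg_by <- function(data, group_var) {
  data %>%
    group_by_at(group_var) %>%
    summarize(mean_mpg = mean(mpg))
}

one_group  <- mean_mpg_by(mtcars, "cyl")
two_groups <- mean_mpg_by(mtcars, c("cyl", "am"))
```

Because the argument is an ordinary character vector, the same function also handles more than one grouping column.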
All this new quoting and unquoting stuff is making my head spin. New to me anyway. – Adam Black Nov 02 '17 at 19:59
