18

Starting with dplyr 0.7, the verbs ending in an underscore, such as summarize_ and group_by_, are deprecated, since we are supposed to use quosures instead.

See: https://cran.r-project.org/web/packages/dplyr/vignettes/programming.html

I am trying to implement the following example using quo and !!.

Working example:

library(dplyr)

df <- data.frame(x = c("a","a","a","b","b","b"), y = c(1,1,2,2,3,3), z = 1:6)

lFG <- df %>%
  group_by(x, y)
lFG %>% summarize(min(z))

However, in the case I need to implement, the columns to group by and to summarize are specified as strings.

cols2group <- c("x","y")
col2summarize <- "z"

How can I get the same example as above working?

zx8754
witek

4 Answers

19

For this you can now use the _at versions of the verbs:

df %>%  
  group_by_at(cols2group) %>% 
  summarize_at(.vars = col2summarize, .funs = min)

Edit (2021-06-09):

Please see Ronak Shah's answer, which uses

summarise(across(all_of(col2summarize), min))

This is now the preferred option.

xilliam
Robin Gertenbach
11

From dplyr 1.0.0 you can use across():

library(dplyr)

cols2group <- c("x","y")
col2summarize <- "z"

df %>%
  group_by(across(all_of(cols2group))) %>%
  summarise(across(all_of(col2summarize), min)) %>%
  ungroup()

#   x       y     z
#  <chr> <dbl> <int>
#1 a         1     1
#2 a         2     3
#3 b         2     4
#4 b         3     5
Ronak Shah
  • 1
    Why do you need the `all_of` inside the across? I just used it without and it works as expected. And it works for Spark! – kael Jun 24 '21 at 10:17
  • 5
    It will work as expected but it will give you a warning (once per session) `Note: Using an external vector in selections is ambiguous.ℹ Use \`all_of(cols2group)\` instead of \`cols2group\` to silence this message.` – Ronak Shah Jun 24 '21 at 10:19
4

Another option is to use non-standard evaluation (NSE), and have R interpret the string as quoted names of objects:

cols2group <- c("x","y")
col2summarize <- "z"

df %>%  
  group_by(!!!rlang::syms(cols2group)) %>% 
  summarize(min(!!rlang::sym(col2summarize)))

The rlang::sym() function turns a single string into a symbol, and rlang::syms() does the same for a character vector, returning a list of symbols. These are unquoted by !! (or spliced by !!! in the case of the list) and used as names in the context of df, where they refer to the relevant columns. There are different ways of doing the same thing, as always, and this is the shorthand I tend to use!
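
For reference, the symbol-based approach can be sketched end to end as follows. This is a minimal sketch, assuming dplyr 1.0.0 or later (for the .groups argument). Since rlang::sym() accepts only a single string, the sketch uses rlang::syms() with !!! for the vector of grouping columns. The names group_syms, sum_sym, and min_z are illustrative and do not come from the original answer.

```r
library(dplyr)
library(rlang)

df <- data.frame(x = c("a","a","a","b","b","b"), y = c(1,1,2,2,3,3), z = 1:6)

cols2group <- c("x", "y")
col2summarize <- "z"

# syms() turns a character vector into a list of symbols (x, y);
# sym() turns a single string into one symbol (z).
group_syms <- syms(cols2group)
sum_sym <- sym(col2summarize)

result <- df %>%
  group_by(!!!group_syms) %>%              # !!! splices the list of symbols
  summarize(min_z = min(!!sum_sym),        # !! unquotes the single symbol
            .groups = "drop")              # return an ungrouped tibble

print(result)
```

Naming the summary column explicitly (min_z here) avoids the auto-generated column name that an unnamed expression would otherwise produce.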

jsavn
1

See ?dplyr::across for the updated way to do this, since group_by_at and summarize_at are now superseded.
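
As one possible illustration of the across() route, the pattern can be wrapped in a small helper so that the column names arrive as function arguments. This is a sketch only, assuming dplyr 1.0.0 or later. summarise_by and its parameters are hypothetical names, not part of dplyr.

```r
library(dplyr)

# Hypothetical helper (not a dplyr function): group by and summarise
# columns whose names are supplied as character vectors.
summarise_by <- function(data, group_cols, value_cols, fn) {
  data %>%
    group_by(across(all_of(group_cols))) %>%
    summarise(across(all_of(value_cols), fn), .groups = "drop")
}

df <- data.frame(x = c("a","a","a","b","b","b"), y = c(1,1,2,2,3,3), z = 1:6)

res <- summarise_by(df, c("x", "y"), "z", min)
print(res)
```

Because all_of() is strict, the helper fails with an error if any requested column is missing from the data, which is usually preferable to silently dropping it.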

Nicolas Molano