4

Data masking for group_by does not work when there is more than one grouping variable.

The code is below:

grpByCols <- "model"

mpg %>%
  group_by(.data[[grpByCols]])

grpByCols <- c("model", "manufacturer")

mpg %>%
  group_by(.data[[grpByCols]])

The first group_by() call works; the second one fails.

The console output is below:

> grpByCols <- "model"
> 
> mpg%>%
+   group_by(.data[[grpByCols]])
# A tibble: 234 x 11
# Groups:   model [38]
   manufacturer model      displ  year   cyl trans      drv     cty   hwy fl    class  
   <chr>        <chr>      <dbl> <int> <int> <chr>      <chr> <int> <int> <chr> <chr>  
 1 audi         a4           1.8  1999     4 auto(l5)   f        18    29 p     compact
 2 audi         a4           1.8  1999     4 manual(m5) f        21    29 p     compact
 3 audi         a4           2    2008     4 manual(m6) f        20    31 p     compact
 4 audi         a4           2    2008     4 auto(av)   f        21    30 p     compact
 5 audi         a4           2.8  1999     6 auto(l5)   f        16    26 p     compact
 6 audi         a4           2.8  1999     6 manual(m5) f        18    26 p     compact
 7 audi         a4           3.1  2008     6 auto(av)   f        18    27 p     compact
 8 audi         a4 quattro   1.8  1999     4 manual(m5) 4        18    26 p     compact
 9 audi         a4 quattro   1.8  1999     4 auto(l5)   4        16    25 p     compact
10 audi         a4 quattro   2    2008     4 manual(m6) 4        20    28 p     compact
# … with 224 more rows
> 
> grpByCols <- c("model", "manufacturer")
> 
> mpg%>%
+   group_by(.data[[grpByCols]])
Error: Problem with `mutate()` input `..1`.
x Must subset the data pronoun with a string.
ℹ Input `..1` is `<unknown>`.
Run `rlang::last_error()` to see where the error occurred.
> 

Please let me know if you have any ideas on how to make this work.

user438383
guna
  • You can group this way `mpg %>% group_by(.[,grpByCols])`, as well. – Kat Aug 22 '21 at 16:23
  • @guna I've changed the title to something more suitable, since I don't think it was directly related to masking, but feel free to change back if I have misunderstood. – user438383 Aug 22 '21 at 16:26
  • Thank you @Kat. I will probably go with the across solution pointed by user438383 – guna Aug 22 '21 at 16:32
  • [dplyr - groupby on multiple columns using variable names](https://stackoverflow.com/questions/34487641/dplyr-groupby-on-multiple-columns-using-variable-names) – Henrik Aug 22 '21 at 20:08

4 Answers

7

A simple way is to use the across() function from dplyr.

mpg %>% group_by(across(all_of(grpByCols)))
# A tibble: 234 × 11
# Groups:   model, manufacturer [38]
   manufacturer model      displ  year   cyl trans drv     cty   hwy fl    class
   <chr>        <chr>      <dbl> <int> <int> <chr> <chr> <int> <int> <chr> <chr>
 1 audi         a4           1.8  1999     4 auto… f        18    29 p     comp…
 2 audi         a4           1.8  1999     4 manu… f        21    29 p     comp…
 3 audi         a4           2    2008     4 manu… f        20    31 p     comp…
 4 audi         a4           2    2008     4 auto… f        21    30 p     comp…
 5 audi         a4           2.8  1999     6 auto… f        16    26 p     comp…
 6 audi         a4           2.8  1999     6 manu… f        18    26 p     comp…
 7 audi         a4           3.1  2008     6 auto… f        18    27 p     comp…
 8 audi         a4 quattro   1.8  1999     4 manu… 4        18    26 p     comp…
 9 audi         a4 quattro   1.8  1999     4 auto… 4        16    25 p     comp…
10 audi         a4 quattro   2    2008     4 manu… 4        20    28 p     comp…
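
As a minimal sketch of how this pattern might be reused, the grouping can be wrapped in a small function that takes the column names as a character vector. The helper name `count_by` is hypothetical and not part of dplyr; `all_of()` is used so that a misspelled column name should raise an error rather than being silently ignored.

```r
library(dplyr)
library(ggplot2)  # only needed for the mpg dataset

# Hypothetical helper (not part of dplyr): count rows per combination
# of the columns named in `cols`.
count_by <- function(data, cols) {
  data %>%
    group_by(across(all_of(cols))) %>%
    summarise(n = n(), .groups = "drop")
}

grpByCols <- c("model", "manufacturer")
counts <- count_by(mpg, grpByCols)
counts
```

The same call should work unchanged with a single column name, for example `count_by(mpg, "model")`.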
user438383
  • 1
    This is Brilliant! I need to dig more into across. Thank you!! – guna Aug 22 '21 at 16:28
  • 3
    This should really be `across(all_of(grpByCols))`. You don't see a warning about directly passing a character vector being ambiguous? – Lionel Henry Aug 23 '21 at 07:10
5

For a single column name, we could convert the string to a symbol with sym() and unquote it with !!

grpByCols <- "model"
mpg %>%
  group_by(!!sym(grpByCols))
   manufacturer model      displ  year   cyl trans      drv     cty   hwy fl    class  
   <chr>        <chr>      <dbl> <int> <int> <chr>      <chr> <int> <int> <chr> <chr>  
 1 audi         a4           1.8  1999     4 auto(l5)   f        18    29 p     compact
 2 audi         a4           1.8  1999     4 manual(m5) f        21    29 p     compact
 3 audi         a4           2    2008     4 manual(m6) f        20    31 p     compact
 4 audi         a4           2    2008     4 auto(av)   f        21    30 p     compact
 5 audi         a4           2.8  1999     6 auto(l5)   f        16    26 p     compact
 6 audi         a4           2.8  1999     6 manual(m5) f        18    26 p     compact
 7 audi         a4           3.1  2008     6 auto(av)   f        18    27 p     compact
 8 audi         a4 quattro   1.8  1999     4 manual(m5) 4        18    26 p     compact
 9 audi         a4 quattro   1.8  1999     4 auto(l5)   4        16    25 p     compact
10 audi         a4 quattro   2    2008     4 manual(m6) 4        20    28 p     compact
# ... with 224 more rows
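
A short sketch of where this approach stops working, assuming rlang is attached: `sym()` expects a single string, so passing the two-element vector from the question should signal an error instead of producing two symbols. The variable names `one` and `res` are only for illustration.

```r
library(dplyr)
library(ggplot2)  # only needed for the mpg dataset
library(rlang)

# One string becomes one symbol, which !! unquotes inside group_by()
one <- mpg %>% group_by(!!sym("model"))
group_vars(one)

# A length-2 character vector is not a single string, so sym() should
# signal an error; tryCatch() captures the condition instead of stopping.
res <- tryCatch(sym(c("model", "manufacturer")), error = function(e) e)
inherits(res, "error")
```

For more than one column, the plural `syms()` with `!!!`, or `across(all_of())`, is the usual route.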
TarJae
4

You can use rlang::syms(), which takes a character vector and turns each string into a symbol. Its output is a list with one symbol per input string (length 2 here), so we use the big bang operator !!! to splice the list into the call, so that each element becomes its own argument to group_by():

library(dplyr)
library(ggplot2)  # provides the mpg dataset
library(rlang)

grpByCols <- c("model", "manufacturer")

mpg %>%
  group_by(!!!syms(grpByCols))

# A tibble: 234 x 11
# Groups:   model, manufacturer [38]
   manufacturer model      displ  year   cyl trans      drv     cty   hwy fl    class  
   <chr>        <chr>      <dbl> <int> <int> <chr>      <chr> <int> <int> <chr> <chr>  
 1 audi         a4           1.8  1999     4 auto(l5)   f        18    29 p     compact
 2 audi         a4           1.8  1999     4 manual(m5) f        21    29 p     compact
 3 audi         a4           2    2008     4 manual(m6) f        20    31 p     compact
 4 audi         a4           2    2008     4 auto(av)   f        21    30 p     compact
 5 audi         a4           2.8  1999     6 auto(l5)   f        16    26 p     compact
 6 audi         a4           2.8  1999     6 manual(m5) f        18    26 p     compact
 7 audi         a4           3.1  2008     6 auto(av)   f        18    27 p     compact
 8 audi         a4 quattro   1.8  1999     4 manual(m5) 4        18    26 p     compact
 9 audi         a4 quattro   1.8  1999     4 auto(l5)   4        16    25 p     compact
10 audi         a4 quattro   2    2008     4 manual(m6) 4        20    28 p     compact
# ... with 224 more rows
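
To make the splicing step more concrete, here is a small sketch that inspects what `syms()` returns and checks that the spliced call groups by the same variables as writing the column names out by hand. The names `symbols`, `spliced` and `by_hand` are illustrative only.

```r
library(dplyr)
library(ggplot2)  # only needed for the mpg dataset
library(rlang)

grpByCols <- c("model", "manufacturer")

# syms() returns a list with one symbol per input string
symbols <- syms(grpByCols)
length(symbols)

# !!! splices the list, so this call behaves like group_by(model, manufacturer)
spliced <- mpg %>% group_by(!!!symbols)
by_hand <- mpg %>% group_by(model, manufacturer)

identical(group_vars(spliced), group_vars(by_hand))
```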
Anoushiravan R
  • 1
    Thank you! I probably would use the across solution. Since the new tidyverse grammar seems to recommend away from !!! – guna Aug 22 '21 at 16:29
3

We can also subset cur_data() with the character vector of column names:

library(dplyr)
mpg %>%
  group_by(cur_data()[grpByCols])

-output

# A tibble: 234 x 11
# Groups:   model, manufacturer [38]
   manufacturer model      displ  year   cyl trans      drv     cty   hwy fl    class  
   <chr>        <chr>      <dbl> <int> <int> <chr>      <chr> <int> <int> <chr> <chr>  
 1 audi         a4           1.8  1999     4 auto(l5)   f        18    29 p     compact
 2 audi         a4           1.8  1999     4 manual(m5) f        21    29 p     compact
 3 audi         a4           2    2008     4 manual(m6) f        20    31 p     compact
 4 audi         a4           2    2008     4 auto(av)   f        21    30 p     compact
 5 audi         a4           2.8  1999     6 auto(l5)   f        16    26 p     compact
 6 audi         a4           2.8  1999     6 manual(m5) f        18    26 p     compact
 7 audi         a4           3.1  2008     6 auto(av)   f        18    27 p     compact
 8 audi         a4 quattro   1.8  1999     4 manual(m5) 4        18    26 p     compact
 9 audi         a4 quattro   1.8  1999     4 auto(l5)   4        16    25 p     compact
10 audi         a4 quattro   2    2008     4 manual(m6) 4        20    28 p     compact
# … with 224 more rows
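
One caveat, offered as a hedged sketch rather than a definitive recommendation: `cur_data()` was deprecated in dplyr 1.1.0 in favour of `pick()`. Assuming a recent dplyr is installed, the equivalent grouping might look like the following; the version check and the fallback to `across()` are additions for illustration and were not part of the original answer.

```r
library(dplyr)
library(ggplot2)  # only needed for the mpg dataset

grpByCols <- c("model", "manufacturer")

# pick() was added in dplyr 1.1.0; on older versions fall back to across()
grouped <- if (packageVersion("dplyr") >= "1.1.0") {
  mpg %>% group_by(pick(all_of(grpByCols)))
} else {
  mpg %>% group_by(across(all_of(grpByCols)))
}

group_vars(grouped)
```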
akrun