I want to pass multiple columns to a single argument of a user-defined function (UDF) in the tidy way, that is, as bare column names.

Example: I have a simple function that takes a column of the mtcars dataset as input and uses it as the grouping variable for a simple count with summarise.

library(tidyverse)

test_function <- function(grps){
  grps <- enquo(grps) 
  mtcars %>% 
    group_by(!!grps) %>% 
    summarise(Count = n())
}

Result if I execute the function with "cyl" as the grouping variable:

test_function(grps = cyl)

-----------------

    cyl Count
  <dbl> <int>
1     4    11
2     6     7
3     8    14

Now imagine I want to pass multiple columns to the argument "grps" so that the dataset is grouped by more than one column. Here is what I imagine the calls could look like:

test_function(grps = c(cyl, gear))
test_function(grps = list(cyl, gear))

Here is what the expected result would look like:

    cyl  gear Count
  <dbl> <dbl> <int>
1     4     3     1
2     4     4     8
3     4     5     2
4     6     3     2
5     6     4     4
6     6     5     1
7     8     3    12
8     8     5     2

Is there a way to pass multiple bare columns to one argument of a UDF? I already know about the "..." operator, but in reality I have 2 arguments that may each need to receive more than one bare column, so "..." is not feasible.

mafiale

1 Answer

You can use across() with an embraced argument ({{ }}) for this; the pattern works in most dplyr verbs. It accepts either bare names or character strings:

test_function <- function(grps){
  mtcars %>% 
    group_by(across({{ grps }})) %>% 
    summarise(Count = n())
}

test_function(grps = c(cyl, gear))

`summarise()` has grouped output by 'cyl'. You can override using the `.groups` argument.
# A tibble: 8 x 3
# Groups:   cyl [3]
    cyl  gear Count
  <dbl> <dbl> <int>
1     4     3     1
2     4     4     8
3     4     5     2
4     6     3     2
5     6     4     4
6     6     5     1
7     8     3    12
8     8     5     2

test_function(grps = c("cyl", "gear"))

# Same output
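
Since the question mentions two arguments that may each receive several bare columns, the same pattern can be applied once per argument. The following is only a minimal sketch: the function name summarise_by, the second argument vals, and the choice of mean as the summary are assumptions made for illustration and are not part of the original question. Setting .groups = "drop" also avoids the regrouping message shown above.

```r
library(dplyr)

# Hypothetical helper (name and arguments are illustrative only):
# `grps` and `vals` can each take one bare column or several wrapped in c().
summarise_by <- function(data, grps, vals) {
  data %>%
    group_by(across({{ grps }})) %>%
    summarise(
      Count = n(),
      # Summarise every column selected through `vals`
      across({{ vals }}, mean, .names = "mean_{.col}"),
      # Return an ungrouped tibble and suppress the grouping message
      .groups = "drop"
    )
}

res <- summarise_by(mtcars, grps = c(cyl, gear), vals = c(mpg, hp))
```

Each argument is embraced separately, so "..." is not needed. A single column still works, for example summarise_by(mtcars, grps = cyl, vals = mpg).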
Ritchie Sacramento