
I am trying to create a function that passes a list of column names to a dplyr function. I know how to do this if the column names are given in the ... form, as explained in the tidyeval documentation:

df <- tibble(
  g1 = c(1, 1, 2, 2, 2),
  g2 = c(1, 2, 1, 2, 1),
  a = sample(5), 
  b = sample(5)
)

my_summarise <- function(df, ...) {
  group_var <- quos(...)

  df %>%
    group_by(!!!group_var) %>%
    summarise(a = mean(a))
}

my_summarise(df, g1, g2)

But if I want to list the column names as an argument of the function, the above solution will not work (of course):

my_summarise <- function(df, group_var, sum_var) {
  group_var <- quos(group_var) # nor enquo(group_var)
  sum_var <- enquo(sum_var)

  df %>%
    group_by(!!!group_var) %>%
    summarise(a = mean(a))
}

my_summarise(df, list(g1, g2), a)
my_summarise(df, list(g1, g2), b)

How can I get the items inside the list to be quoted individually?

This question is similar to "Passing dataframe column names in a function inside another function", but in the comments there it was suggested to use strings, while here I would like to use bare column names.

Sotos
Stefano
  • Maybe here is another question connected with your problem https://stackoverflow.com/questions/44166247/referring-to-individual-variables-in-with-dplyr-quos – Scipione Sarlo Dec 27 '17 at 14:36
  • How about passing your "group_var" argument as a quosure via `quos` instead of with `list`, as shown in the first part of [this answer](https://stackoverflow.com/a/44593617/2461552)? – aosmith Dec 27 '17 at 15:19

2 Answers

library(dplyr)

df <- tibble(
  g1 = c(1, 1, 2, 2, 2),
  g2 = c(1, 2, 1, 2, 1),
  a = sample(5), 
  b = sample(5)
)

my_summarise = function(df, group_var, fun_name) {

  df %>%
    group_by(!!! group_var) %>%
    summarize_all(fun_name)
}

my_summarise(df, alist(g1, g2), mean)

alist() handles the arguments 'g1' and 'g2' as function arguments (it does not evaluate them), while !!! (same as UQS()) unquotes and splices the list. sum_var is not necessary, as it looks like you want to take the mean of both 'a' and 'b'. Also, you can generalize it by passing in the function as well.
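
To make the role of alist() and !!! a bit more concrete, here is a minimal sketch. The fixed values for 'a' and 'b' are my own stand-ins for sample(5) so that the result is reproducible, and the name `res` is just for illustration.

```r
library(dplyr)

# Same shape as the question's data; a and b are fixed here (an assumption)
# instead of sample(5), so the means are reproducible.
df <- tibble(
  g1 = c(1, 1, 2, 2, 2),
  g2 = c(1, 2, 1, 2, 1),
  a = c(3, 1, 5, 2, 4),
  b = c(2, 3, 5, 1, 4)
)

# alist() keeps g1 and g2 as unevaluated symbols;
# list(g1, g2) would try to look up objects named g1 and g2 and fail.
group_var <- alist(g1, g2)
str(group_var)

# !!! splices the two symbols in as separate arguments,
# so this is equivalent to group_by(g1, g2).
res <- df %>%
  group_by(!!!group_var) %>%
  summarise(a = mean(a))

res
```

With list(g1, g2) the call would typically stop with an "object 'g1' not found" error before dplyr ever sees the column names, which is why the quoting has to happen at the call site.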

AlphaDrivers
  • This is the right answer. You want the quoting to be external and explicitly done by the user rather than implicitly by your function. To this end you can ask your users to quote with `base::alist()`, `rlang::exprs()`, or `dplyr::vars()`. – Lionel Henry Dec 28 '17 at 14:32

You could pass your list of arguments using alist instead of list, as it won't evaluate the arguments.

my_summarise = function(df, group_var, sum_var) {
    group_var = quos(!!! group_var)
    sum_var = enquo(sum_var)

    df %>%
        group_by(!!! group_var) %>%
        summarise(!! quo_name( sum_var) := mean( !! sum_var) )
}

my_summarise(df, alist(g1, g2), b)

# A tibble: 4 x 3
# Groups:   g1 [?]
     g1    g2     b
  <dbl> <dbl> <dbl>
1     1     1   2.0
2     1     2   3.0
3     2     1   4.5
4     2     2   1.0

Another alternative would be to pass that argument directly with quos instead of list, as shown in this answer, which bypasses some complications altogether.

my_summarise = function(df, group_var, sum_var) {
    # group_var = quos(!!! group_var)
    sum_var = enquo(sum_var)

    df %>%
        group_by(!!! group_var) %>%
        summarise(!! quo_name( sum_var) := mean( !! sum_var) )
}

my_summarise(df, quos(g1, g2), b)

# A tibble: 4 x 3
# Groups:   g1 [?]
     g1    g2     b
  <dbl> <dbl> <dbl>
1     1     1   2.0
2     1     2   3.0
3     2     1   4.5
4     2     2   1.0
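
As a quick sanity check that the two calling styles are interchangeable for this function, here is a small sketch that runs both and compares the results. The fixed values for 'a' and 'b' are an assumption standing in for sample(5), and `res_alist` / `res_quos` are just illustrative names.

```r
library(dplyr)

# Fixed data (an assumption, replacing sample(5)) so both runs see the same values.
df <- tibble(
  g1 = c(1, 1, 2, 2, 2),
  g2 = c(1, 2, 1, 2, 1),
  a = c(3, 1, 5, 2, 4),
  b = c(2, 3, 5, 1, 4)
)

my_summarise <- function(df, group_var, sum_var) {
  # sum_var is captured inside the function; group_var arrives already quoted
  sum_var <- enquo(sum_var)

  df %>%
    group_by(!!!group_var) %>%
    summarise(!!quo_name(sum_var) := mean(!!sum_var))
}

# Quoting done by the caller, once with alist() and once with quos()
res_alist <- my_summarise(df, alist(g1, g2), b)
res_quos <- my_summarise(df, quos(g1, g2), b)

# Compare the plain data, ignoring tibble-specific attributes
all.equal(as.data.frame(res_alist), as.data.frame(res_quos))
```

Since group_by() accepts both spliced symbols and spliced quosures, the function body does not need to know which of the two the caller used.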
aosmith