On a fairly regular basis I want to pass in strings that function as arguments in code. For context, I often want a section where I can pass in filtering criteria or assumptions that then flow through my analysis, plots, etc. to make it more interactive.

A simple example is below. I've seen the eval/parse solution, but it seems like that makes code chunks unreadable. Is there a better/cleaner/shorter way to do this?

group.options <- c("group1", "group2") # Two column names I want to be able to toggle between for grouping
select.column <- group.options[1] # Select the column for grouping

DataTable.summary <- 
  DataTable %>% 
  group_by(select.column) %>% #How do I pass that selection in here? 
  summarize(avg.price = mean(SALES.PRICE))
xhr489
  • First, you need to make this reproducible: `Error in eval(lhs, parent, parent) : object 'city.filtered' not found`. It's also unclear what you mean by "group1" being some sort of selection. Generally a selection criterion will be a logical result. – IRTFM Feb 24 '19 at 20:05
  • Creating functions with dplyr is not so simple. You need to read about these things: tidyeval, rlang, the !! operator, enquo... But it is a good question. Try updating your tags with some of the things I have mentioned. – xhr489 Feb 24 '19 at 20:05
  • This link explains everything [link](https://dplyr.tidyverse.org/articles/programming.html#quoting). Go to the section with Quoting to see an example. – xhr489 Feb 24 '19 at 20:18
  • Also this question might be helpful: https://stackoverflow.com/questions/50011988/supplying-multiple-groups-of-variables-to-a-function-for-dplyr-arguments-in-the – Mikko Feb 24 '19 at 21:39
  • This might be useful too https://stackoverflow.com/a/49470372/786542 – Tung Feb 24 '19 at 23:45

3 Answers

This is copied from the tidyverse website (https://dplyr.tidyverse.org/articles/programming.html#programming-recipes):

my_summarise <- function(df, group_var) {
  group_var <- enquo(group_var)
  print(group_var)
  df %>%
    group_by(!! group_var) %>%
    summarise(a = mean(a))
}
my_summarise(df, g1)
#> <quosure>
#> expr: ^g1
#> env:  global
#> # A tibble: 2 x 2
#>      g1     a
#>   <dbl> <dbl>
#> 1     1  2.5 
#> 2     2  3.33

But I think it illustrates your problem. I think what you really want is to write a function, as in the code above.
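Since the question starts from a string rather than a bare column name, a function can also take the name as a string and look it up through the `.data` pronoun. This is only a minimal sketch, assuming a reasonably recent dplyr (1.0 or later); `summarise_by_name` and the `toy` data frame are hypothetical names made up for illustration:

```r
library(dplyr)

# Hypothetical toy data, only for illustration
toy <- data.frame(
  group1 = c("a", "a", "b", "b"),
  group2 = c("x", "y", "x", "y"),
  SALES.PRICE = c(10, 20, 30, 40)
)

# Hypothetical helper: group_col is a string naming the grouping column
summarise_by_name <- function(df, group_col) {
  df %>%
    group_by(.data[[group_col]]) %>%          # look the column up by its name
    summarise(avg.price = mean(SALES.PRICE))
}

group.options <- c("group1", "group2")
res1 <- summarise_by_name(toy, group.options[1])
```

Switching to `group.options[2]` regroups by the other column without touching the pipeline.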

xhr489

You can use the group_by_ function for the example in your question:

library(dplyr)

x <- data.frame(group1 = letters[1:4], group2 = LETTERS[1:4], value = 1:4)
select.colums <- c("group1", "group2")

x %>% group_by_(select.colums[2]) %>% summarize(avg = mean(value))
# A tibble: 4 x 2
#  group2   avg
#  <fct>  <dbl>
# 1 A          1
# 2 B          2
# 3 C          3
# 4 D          4

The underscore-suffixed family of functions in dplyr (group_by_, summarise_, etc.) might offer the more general solution you are after, but the dplyr documentation marks them as deprecated (see ?group_by_), so they might disappear at some point. An analogous expression using the tidy evaluation syntax seems to be:

x %>% group_by(!!sym(select.colums[2])) %>% summarize(avg = mean(value))

And for several columns:

x %>% group_by(!!!syms(select.colums)) %>% summarize(avg = mean(value))

Here sym() turns a string into a symbol and !! unquotes it, so dplyr evaluates it as a column name. syms() and !!! do the same for a character vector of names.
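The same idea can be wrapped in a function so that the column names arrive as strings. This is a sketch under assumptions rather than a definitive implementation: `mean_by` is a hypothetical helper name, `mtcars` stands in for the real data, and the `.groups` argument assumes dplyr 1.0 or later:

```r
library(dplyr)
library(rlang)

# Hypothetical helper: cols is a character vector of grouping columns,
# value_col is a string naming the column to average
mean_by <- function(df, cols, value_col) {
  df %>%
    group_by(!!!syms(cols)) %>%                 # splice several symbols
    summarise(avg = mean(!!sym(value_col)),     # unquote a single symbol
              .groups = "drop")
}

res2 <- mean_by(mtcars, c("cyl", "am"), "mpg")
```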

Mikko

I recommend using group_by_at(). It supports both single strings and character vectors:

nms <- c("cyl", "am")

mtcars %>% group_by_at(nms)
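As a hedged aside: from dplyr 1.0.0 onwards the scoped `_at` verbs are marked as superseded in favour of `across()`. Assuming such a version is installed, the same grouping by a character vector could be sketched as follows (`grouped` is just an illustrative name):

```r
library(dplyr)

nms <- c("cyl", "am")

# all_of() selects exactly the columns named in the character vector
grouped <- mtcars %>% group_by(across(all_of(nms)))

# Inspect which columns the data is grouped by
group_vars(grouped)
```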
Lionel Henry