
I would like to use invoke_map to call a list of functions, passing the same set of variable names as arguments to each one. Ultimately the variable names will be used with group_by.

Here's an example:

library(dplyr)
library(purrr)
first_fun <- function(...){
  by_group = quos(...)
  mtcars %>%
    group_by(!!!by_group) %>%
    count()
}

second_fun <- function(...){
  by_group = quos(...)
  mtcars %>%
    group_by(!!!by_group) %>%
    summarise(avg_wt = mean(wt))
}

first_fun(mpg, cyl) # works
second_fun(mpg, cyl) # works

both_funs <- list(first_fun, second_fun)

both_funs %>%
  invoke_map(mpg, cyl) # What do I do here?

I have tried putting the variable names in quotes, wrapping them in enquo, using vars, referencing .data$mpg, and so on, but I am stabbing in the dark a bit.

moodymudskipper
Ryan Knight
  • [This question](https://stackoverflow.com/questions/43415475/how-to-parametrize-function-calls-in-dplyr-0-7/43416705#43416705) shows how to address the problem by changing the functions to get grouping columns from a named parameter instead of dots, which is a partial solution. I'm still interested in the question of how you pass variable names through dots using invoke, though perhaps the answer is don't get variable names from dots. – Ryan Knight Sep 19 '17 at 14:51
  • You'd think the `.env` argument could be useful here, but `invoke_map(first_fun, list(list(mpg, cyl)), .env = as.environment(mtcars))` doesn't work either. – Axeman Sep 22 '17 at 12:20
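
One way to keep the dots-based functions from the question is to skip invoke_map and call each function from inside an anonymous function, so the bare names are written directly in the call. This is a minimal sketch of that alternative, not the approach taken in the answer below; `results` is an illustrative name.

```r
library(dplyr)
library(purrr)

first_fun <- function(...) {
  by_group <- quos(...)
  mtcars %>%
    group_by(!!!by_group) %>%
    count()
}

second_fun <- function(...) {
  by_group <- quos(...)
  mtcars %>%
    group_by(!!!by_group) %>%
    summarise(avg_wt = mean(wt))
}

both_funs <- list(first_fun, second_fun)

# f(mpg, cyl) is an ordinary call, so quos(...) inside f captures the
# bare names unevaluated, just as in first_fun(mpg, cyl).
results <- map(both_funs, function(f) f(mpg, cyl))

# Grouping variables seen by the first function
group_vars(results[[1]])
```

The names never pass through map's own arguments here, so nothing tries to look up `mpg` or `cyl` as objects before the functions capture them.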

1 Answer


The issue is not that you're using dots. It's that you're passing bare names, and these arguments are evaluated when map2_impl is called.

Try this and explore the environment:

debugonce(map2)
both_funs %>% invoke_map("mpg", "cyl")
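
As a minimal sketch of the failure itself, the snippet below wraps the question's bare-name call in `tryCatch()` so the error can be inspected instead of stopping the script. It assumes no objects named `mpg` or `cyl` exist in the calling environment; `result` is an illustrative name.

```r
library(dplyr)
library(purrr)

first_fun <- function(...) {
  by_group <- quos(...)
  mtcars %>%
    group_by(!!!by_group) %>%
    count()
}

# invoke_map() receives mpg and cyl as ordinary arguments, so they are
# looked up as objects before first_fun can capture them with quos().
result <- tryCatch(
  invoke_map(list(first_fun), mpg, cyl),
  error = function(e) e
)

inherits(result, "error")   # did the call fail?
conditionMessage(result)    # the message that was raised
```

Recent purrr releases may also emit a deprecation warning for `invoke_map()`, which is separate from the error captured here.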

This works on the other hand:

first_fun2 <- function(...){
  mtcars %>%
    {do.call(group_by_, list(., unlist(list(...))))} %>%
    count()
}

second_fun2 <- function(...){
  mtcars %>%
    {do.call(group_by_, list(., unlist(list(...))))} %>%
    summarise(avg_wt = mean(wt))
}

both_funs2 <- list(first_fun2, second_fun2)
both_funs2 %>% invoke_map("mpg", "cyl") 

# [[1]]
# # A tibble: 25 x 2
# # Groups:   mpg [25]
#      mpg     n
#    <dbl> <int>
#  1  10.4     2
#  2  13.3     1
#  3  14.3     1
#  4  14.7     1
#  5  15.0     1
#  6  15.2     2
#  7  15.5     1
#  8  15.8     1
#  9  16.4     1
# 10  17.3     1
# # ... with 15 more rows
#
# [[2]]
# # A tibble: 25 x 2
#      mpg avg_wt
#    <dbl>  <dbl>
#  1  10.4 5.3370
#  2  13.3 3.8400
#  3  14.3 3.5700
#  4  14.7 5.3450
#  5  15.0 3.5700
#  6  15.2 3.6075
#  7  15.5 3.5200
#  8  15.8 3.1700
#  9  16.4 4.0700
# 10  17.3 3.7300
# # ... with 15 more rows
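
The printed output reports `Groups: mpg [25]`, so only the first string appears to reach the grouping. As a hedged variation (not part of the original answer), the strings can be converted to symbols with `rlang::syms()` and spliced into `group_by()`, which should group by every name supplied. The helper names `first_fun_sym`, `second_fun_sym` and `both_funs_sym` are hypothetical.

```r
library(dplyr)
library(purrr)

# Hypothetical helpers: column names arrive as strings through the dots
# and are turned into symbols before being spliced into group_by().
first_fun_sym <- function(...) {
  by_group <- rlang::syms(c(...))
  mtcars %>%
    group_by(!!!by_group) %>%
    count()
}

second_fun_sym <- function(...) {
  by_group <- rlang::syms(c(...))
  mtcars %>%
    group_by(!!!by_group) %>%
    summarise(avg_wt = mean(wt))
}

both_funs_sym <- list(first_fun_sym, second_fun_sym)

# Call each function with the same string arguments.
results_sym <- map(both_funs_sym, function(f) f("mpg", "cyl"))

# Grouping variables of the count() result
group_vars(results_sym[[1]])
```

The sketch uses `map()` with an anonymous function rather than `invoke_map()` because purrr 1.0.0 deprecated the `invoke_*()` family. Since the arguments are plain strings, evaluating them early is harmless.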
moodymudskipper
  • If the issue is that the names are being evaluated, shouldn't I be able to use one of the `enquo` family of functions to delay evaluation? – Ryan Knight Sep 22 '17 at 16:31
  • But you don't just want to delay evaluation, you want to avoid evaluation altogether. The line that crashes is `.Call(map2_impl, environment(), ".x", ".y", ".f", "list")`. As far as I understand, it calls C code that evaluates the parameters listed as strings, so I see no room for your names to pass through it. – moodymudskipper Sep 22 '17 at 22:02
  • I also see a conceptual problem: these names don't refer to existing objects or to columns of a current object, so they should really be strings in this example. – moodymudskipper Sep 22 '17 at 22:04
  • This works: `first_fun3 <- function(vars){ mtcars %>% group_by_at(.vars = vars) %>% count()} ` then calling `invoke_map` with `vars=(mpg, cyl)` or would if I could get the formatting right in a comment. So perhaps the intended approach is to use `group_by_at` instead of `group_by`? – Ryan Knight Sep 26 '17 at 21:16
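
The `group_by_at` idea from the last comment can be sketched as follows, with the grouping columns passed as a character vector through a named `vars` parameter. `first_fun3` follows the comment; `second_fun3`, `both_funs3` and `results3` are hypothetical names added for illustration, and the functions are called through `map()` here rather than `invoke_map()`.

```r
library(dplyr)
library(purrr)

# Grouping columns come in as a character vector, not through the dots.
first_fun3 <- function(vars) {
  mtcars %>%
    group_by_at(.vars = vars) %>%
    count()
}

# Hypothetical companion, mirroring second_fun from the question.
second_fun3 <- function(vars) {
  mtcars %>%
    group_by_at(.vars = vars) %>%
    summarise(avg_wt = mean(wt))
}

both_funs3 <- list(first_fun3, second_fun3)

# Pass the same character vector to every function in the list.
results3 <- map(both_funs3, function(f) f(vars = c("mpg", "cyl")))

group_vars(results3[[1]])
```

In dplyr 1.0 and later, `group_by(across(all_of(vars)))` is the documented replacement for `group_by_at(vars)`, so the same sketch could be written that way on newer versions.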