I have a series of aggregation functions that I want to apply to multiple columns in a data frame (using purrr::map). Some are standard functions, like mean, and others are user-defined functions.

I want the output to be a data frame with the names of those functions as the columns that store their output.

Here's an example that shows the procedure and the desired output:

library(purrr)

foo <- function(x) sd(x) * .1 # arbitrary user-defined function
agg_funcs <- c(mean, median, foo)
names(agg_funcs) <- c("mean", "median", "foo")

fields <- c("mpg", "disp") # fields to aggregate with agg_funcs

fields %>% 
  set_names(fields) %>% 
  map_df(function(x) map_df(agg_funcs, function(f) f(mtcars[[x]])), .id = "field")

# A tibble: 2 x 4
  field  mean median    foo
  <chr> <dbl>  <dbl>  <dbl>
1 mpg    20.1   19.2  0.603
2 disp  231.   196.  12.4  

But I don't want to name the functions in agg_funcs by hand. Instead, I'd like to do something like

agg_funcs %>% set_names(agg_funcs) # doesn't work

I know that I can get the string representation of a function name like this:

as.character(substitute(mean))
# "mean"

And I can even wrap this operation in a function and pass in the agg funcs I want, one by one:

f <- function(func) as.character(substitute(func))
f(mean) # "mean"

This function f() is the sort of thing I'd want in my nested map operation above, to get the function names as strings.
But this fails when I try to map or sapply over agg_funcs (for reasons I don't totally understand):

sapply(agg_funcs, f) # alt: map(agg_funcs, f)

     mean median foo 
[1,] "[[" "[["   "[["
[2,] "X"  "X"    "X" 
[3,] "i"  "i"    "i" 
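What seems to be happening: `sapply()` (via `lapply()`) calls the function as `FUN(X[[i]], ...)`, so `substitute(func)` captures the expression `X[[i]]` rather than the name the function was originally stored under, and `as.character()` then splits that call into its three parts. A minimal sketch of this behaviour, where `show_expr` is a hypothetical helper that is not part of the code above:

```r
# show_expr is a hypothetical helper (an assumption for illustration):
# it returns the unevaluated argument expression as a single string.
show_expr <- function(func) deparse(substitute(func))

# Called directly, the argument expression is the symbol `mean`.
direct <- show_expr(mean)

# Called through sapply(), every call is FUN(X[[i]]), so the captured
# expression is `X[[i]]`, whatever function sits in that list slot.
funs <- list(mean, median)
via_sapply <- sapply(funs, show_expr)

# as.character() on that call yields its parts: "[[" "X" "i"
parts <- as.character(quote(X[[i]]))

print(direct)
print(via_sapply)
print(parts)
```

By the time the list has been built, the original names are no longer attached to the function objects, so they have to be captured at the point where the list is created.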

How can I achieve the result I'm looking for (names of functions as columns in mapped output df) without creating names(agg_funcs) by hand?

andrew_reece
  • Wouldn't this be easier with `dplyr::lst(mean, median, foo)` – akrun Jan 01 '21 at 22:23
  • @akrun yes, that solves my problem - i didn't know about `lst`, thanks. Looks like it is imported into `dplyr` from `tibble`, and is currently in the "questioning" phase. But apparently just a copy of `rlang::list2()`. Great to know about this, many thanks. Can you make it into an answer? – andrew_reece Jan 01 '21 at 22:27

1 Answer

We can use lst from dplyr (it is re-exported from tibble), which returns a named list whose names are taken from the arguments passed:

dplyr::lst(mean, median, foo)
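As a sketch of how this could slot into the pipeline from the question (assuming both purrr and dplyr are installed; the variable name `result` is introduced here only for illustration):

```r
library(purrr)  # map_df(), set_names(), and the %>% pipe
library(dplyr)  # lst(), re-exported from tibble

foo <- function(x) sd(x) * .1  # arbitrary user-defined function from the question

# lst() names each element after the expression used to supply it,
# so no manual names(agg_funcs) <- ... step is needed.
agg_funcs <- lst(mean, median, foo)
print(names(agg_funcs))

fields <- c("mpg", "disp")  # fields to aggregate with agg_funcs

# Same nested map as in the question; the inner map_df() uses the
# list names as column names.
result <- fields %>%
  set_names(fields) %>%
  map_df(function(x) map_df(agg_funcs, function(f) f(mtcars[[x]])), .id = "field")

print(result)
```

The names are captured when `lst()` is called, so each function has to be passed as a bare name such as `mean` rather than pulled from an existing unnamed list.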
akrun