This is not a duplicate of questions such as "Row-wise iteration like apply with purrr".
I understand how to use pmap() to do a row-wise operation on a data frame:
library(tidyverse)
df1 = tribble(~col_1, ~col_2, ~col_3,
              1, 5, 12,
              9, 3, 3,
              6, 10, 7)
foo = function(col_1, col_2, col_3) {
  mean(c(col_1, col_2, col_3))
}
df1 %>% pmap_dbl(foo)
This gives the function foo applied to every row:
[1] 6.000000 5.000000 7.666667
But this gets pretty unwieldy when I have more than a few columns, because I have to pass them all in explicitly. What if I had, say, 8 columns in my data frame df2 and wanted to apply a function bar that potentially involves every single one of those columns?
set.seed(12345)
df2 = rnorm(n=24) %>% matrix(nrow=3) %>% as_tibble() %>%
  setNames(c("col_1", "col_2", "col_3", "col_4", "col_5", "col_6", "col_7", "col_8"))
bar = function(col_1, col_2, col_3, col_4, col_5, col_6, col_7, col_8) {
  # imagine we do some complicated row-wise operation here
  mean(c(col_1, col_2, col_3, col_4, col_5, col_6, col_7, col_8))
}
df2 %>% pmap_dbl(bar)
Gives:
[1] 0.45085420 0.02639697 -0.28121651
This is clearly inadequate: I have to add a new argument to bar for every single column. It's a lot of typing, and it makes the code less readable and more fragile. It seems like there should be a way to have it take a single argument x and then access the variables I want as x$col_1, etc., or at any rate something more elegant than the above. Is there any way to clean this code up using purrr?
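To make the desired interface concrete, here is a minimal sketch of one way it might look. It assumes each row's columns can be captured through `...` and bundled into a named list. The name foo_row is hypothetical and used only for illustration, and this is not necessarily the most idiomatic purrr approach.

```r
library(purrr)

# Same values as df1 above, built with base R to keep the sketch self-contained
df1 <- data.frame(col_1 = c(1, 9, 6),
                  col_2 = c(5, 3, 10),
                  col_3 = c(12, 3, 7))

# Hypothetical single-argument version of foo:
# x is a named list holding the values of one row
foo_row <- function(x) {
  mean(c(x$col_1, x$col_2, x$col_3))
}

# pmap_dbl() passes each column as a named argument; capturing them with ...
# and wrapping them in list(...) yields one named list per row
result <- pmap_dbl(df1, function(...) foo_row(list(...)))
result
```

With this shape, adding a column to the data frame would not require changing the signature of the row function, only the body where the new column is used.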