Extending this answer given by @G. Grothendieck, how can I pass more than one grouping variable to dplyr inside a function?
Let's say I have this data:
library(dplyr)

# Data
set.seed(1)
dfx <- data.frame(nLive = sample(x = 10, size = 40, replace = TRUE),
                  nDead = sample(x = 3, size = 40, replace = TRUE),
                  areaA = c(rep("A", 20), rep("B", 20)),
                  areaB = rep(c(rep("yes", 10), rep("no", 10)), 2),
                  year = rep(c(2000, 2002, 2004, 2006, 2008), 4)
)
I want to group by year, and possibly up to 2 other variables.
G. Grothendieck's example works perfectly for specifying 1 index:
UnFun <- function(dat, index) {
  dat %>%
    group_by(year) %>%
    regroup(list(index)) %>%
    summarise(n = n())
}
> UnFun(dfx, "areaA")
Source: local data frame [2 x 2]
areaA n
1 A 20
2 B 20
> UnFun(dfx, "areaB")
Source: local data frame [2 x 2]
areaB n
1 no 20
2 yes 20
But when I try to group by both areaA and areaB (or by year alone), I get errors or wrong answers:
> UnFun(dfx, list("areaA", "areaB"))
Error: cannot convert to symbol (SYMSXP)
> UnFun(dfx, c("areaA", "areaB"))
Source: local data frame [2 x 2]
areaA n
1 A 20
2 B 20
> UnFun(dfx, NULL)
Error: cannot convert to symbol (SYMSXP)
Any tips on how to correctly specify 0, 1, or 2 additional grouping variables?
Thanks, R Community!