Suppose we start with the following:
library(dplyr)
library(magrittr)
library(tibble)
set.seed(123)
tbl <- data_frame(value=rnorm(100), class=rep(LETTERS[1:5], each=20))
I'd like to write a function summarise_means(data, values, groupby) which, given tbl, "value", and "class", returns the same output as the following code:
tbl %>%
  group_by(class) %>%
  summarise(mean(value))
My first attempt was:
summarise_means <- function(data, values, groupby) {
  data %>%
    group_by(groupby) %>%
    summarise(mean(values))
}
This, of course, failed with:
Error: unknown variable to group by : groupby
After a bit of digging, I determined that I ought to be using the group_by_ and summarize_ functions, but I suspect that I am using them incorrectly here, as this still doesn't work:
summarise_means <- function(data, values, groupby) {
  data %>%
    group_by_(groupby) %>%
    summarise_(mean(values))
}
When I call summarise_means(tbl, 'value', 'class'), I get:
# A tibble: 5 x 2
  class NA_real_
  <chr>    <dbl>
1 A           NA
2 B           NA
3 C           NA
4 D           NA
5 E           NA
Warning message:
In mean.default(values) : argument is not numeric or logical: returning NA
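As far as I can tell, the warning is reproducible without dplyr at all, which makes me think mean(values) is being evaluated on the string "value" itself before summarise_ ever sees it. A minimal sketch in base R (the variable name just mirrors my function's argument, and this is my guess at the cause, not a confirmed diagnosis):

```r
# Base R only; `values` mirrors the argument passed to summarise_means().
values <- "value"

# mean() receives the character string, not the column it names,
# so mean.default() gives up and returns NA with a warning.
result <- suppressWarnings(mean(values))

# Capture the warning text instead of printing it to the console.
msg <- tryCatch(mean(values), warning = function(w) conditionMessage(w))

print(is.na(result))  # TRUE
print(msg)
```

If that reading is right, it would also explain why the result column is named NA_real_: summarise_ seems to be handed the already-computed NA rather than an expression involving the column.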
I don't really understand what's going wrong here. Any help is greatly appreciated!