Using dplyr, you can just group_by and summarize:
food %>%
  group_by(foodID) %>%
  summarize(calories_average = mean(calories),
            protein_average = mean(protein))
# A tibble: 3 x 3
  foodID calories_average protein_average
   <int>            <dbl>           <dbl>
1    123             0.41             0.7
2    432             0.65             0.7
3    983             0.82             0.6
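To try this yourself, here is a minimal reproducible sketch. The food data frame below is hypothetical: its values are invented for illustration and are not the data from the question. Only the column names foodID, calories, and protein are taken from the code above.

```r
library(dplyr)

# Hypothetical data: two rows per foodID, values invented for illustration
food <- tibble(
  foodID   = c(123L, 123L, 432L, 432L),
  calories = c(0.4, 0.6, 0.7, 0.9),
  protein  = c(0.5, 0.7, 0.2, 0.4)
)

# One output row per foodID, with the mean of each measured column
result <- food %>%
  group_by(foodID) %>%
  summarize(calories_average = mean(calories),
            protein_average  = mean(protein))

result
```

If your real data contains missing values, you would probably want mean(calories, na.rm = TRUE) instead, since mean returns NA for any group that contains an NA.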
Rather than specifying each variable, you can use summarize_at to select multiple variables to summarize at once. We pass in two arguments: the variables to summarize, and a list of functions to apply to them. If the list is named, as it is here, then the name is added to each summary column as a suffix (giving "calories_average" and "protein_average"):
food %>%
  group_by(foodID) %>%
  summarize_at(c('calories', 'protein'), list(average = mean))
summarize_at also allows you to use various helper functions, wrapped in vars(), to select variables by prefix, suffix, or regex (as shown below). You can learn more about them here: ?tidyselect::select_helpers
food %>%
  group_by(foodID) %>%
  summarize_at(vars(matches('calories|protein')), list(average = mean))
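As a side note, in dplyr 1.0.0 and later the scoped verbs such as summarize_at are superseded by across(), which is used inside a plain summarize. The sketch below shows what I believe are the equivalent calls. The food data frame is again hypothetical, made up so the example runs on its own.

```r
library(dplyr)

# Hypothetical data, invented for illustration
food <- tibble(
  foodID   = c(123L, 123L, 432L, 432L),
  calories = c(0.4, 0.6, 0.7, 0.9),
  protein  = c(0.5, 0.7, 0.2, 0.4)
)

# Select columns by name; the list name "average" becomes the column suffix
by_name <- food %>%
  group_by(foodID) %>%
  summarize(across(c(calories, protein), list(average = mean)))

# Select columns with a tidyselect helper; no vars() wrapper is needed here
by_regex <- food %>%
  group_by(foodID) %>%
  summarize(across(matches("calories|protein"), list(average = mean)))

by_name
```

Both calls should give the same calories_average and protein_average columns as summarize_at. The grouping column foodID is left out of the across() selection automatically.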