With dplyr and R, you can use group_by and summarize to aggregate data.
For instance:
mpg_cyl_carb <- mtcars %>%
group_by(cyl, carb) %>%
summarise(var1 = mean(mpg))
head(mpg_cyl_carb, 3)
A tibble: 3 x 3
Groups: cyl [2]
cyl carb var1
<dbl> <dbl> <dbl>
1 4 1 27.6
2 4 2 25.9
3 6 1 19.8
It means that when cyl = 4 and carb = 4, the mean for mpg is 27.6. When cyl = 6 and carb = 1, the mean is 19.8, and so on.
I would like to nest those aggregate results on the original dataframe. Currently am joining two tables to do this:
> mtcars %>%
+ left_join(mpg_cyl_carb, by = c("cyl", "carb")) %>%
+ head(3) %>%
+ select(mpg, cyl, carb, var1)
mpg cyl carb var1
1 21.0 6 4 19.75
2 21.0 6 4 19.75
3 22.8 4 1 27.58
But is there an easier way? A single command for mutate, like:
> mtcars %>%
+ mutate(. . .)
Not a solution using if_else, as it would add complexity.