I want to create a data frame with columns for the proportion of observations in each category, much like this:
library(tidyverse)
mtcars %>%
  group_by(am) %>%
  summarise(gear3 = sum(gear == 3)/n(),
            gear4 = sum(gear == 4)/n(),
            gear5 = sum(gear == 5)/n(),
            cyl4 = sum(cyl == 4)/n(),
            cyl6 = sum(cyl == 6)/n(),
            cyl8 = sum(cyl == 8)/n())
# # A tibble: 2 x 7
#      am gear3 gear4 gear5  cyl4  cyl6  cyl8
#   <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
# 1     0 0.789 0.211 0     0.158 0.211 0.632
# 2     1 0     0.615 0.385 0.615 0.231 0.154
I am looking for a way to do this without manually naming the new summary variables.
There seem to be a few questions, such as here, about creating proportions for a single variable. I could replicate that approach for each variable, pivot, and then combine the results, but that will become tedious in my application, because I am trying to build the data frame for many variables:
mtcars %>%
  group_by(am, gear) %>%
  summarise(n = n()) %>%
  mutate(freq = n / sum(n))
# # A tibble: 4 x 4
# # Groups:   am [2]
#      am  gear     n  freq
#   <dbl> <dbl> <int> <dbl>
# 1     0     3    15 0.789
# 2     0     4     4 0.211
# 3     1     4     8 0.615
# 4     1     5     5 0.385
mtcars %>%
  group_by(am, cyl) %>%
  summarise(n = n()) %>%
  mutate(freq = n / sum(n))
# # A tibble: 6 x 4
# # Groups:   am [2]
#      am   cyl     n  freq
#   <dbl> <dbl> <int> <dbl>
# 1     0     4     3 0.158
# 2     0     6     4 0.211
# 3     0     8    12 0.632
# 4     1     4     8 0.615
# 5     1     6     3 0.231
# 6     1     8     2 0.154
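For concreteness, the pivot-and-combine workaround mentioned above might look roughly like the sketch below. It is only an illustration under a few assumptions: the helper name prop_wide and its prefix argument are made up for this example, and it relies on pivot_wider() accepting a scalar values_fill (which I believe needs tidyr 1.1 or later; older versions may want a named list instead).

```r
library(tidyverse)

# Hypothetical helper (not from any package): proportions of `var` within
# each level of am, spread to one column per observed value of `var`.
prop_wide <- function(data, var, prefix) {
  data %>%
    count(am, {{ var }}) %>%             # one row per am/value combination
    group_by(am) %>%
    mutate(freq = n / sum(n)) %>%        # proportion within each am group
    ungroup() %>%
    select(-n) %>%
    pivot_wider(names_from = {{ var }}, values_from = freq,
                names_prefix = prefix,
                values_fill = 0)         # combinations that never occur become 0
}

# One call per variable, then join the wide pieces on am
combined <- left_join(
  prop_wide(mtcars, gear, "gear"),
  prop_wide(mtcars, cyl, "cyl"),
  by = "am"
)
combined
```

This should give the same seven columns as the manual summarise() call, but it still needs one call and one join per variable, which is the part I would like to avoid.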