dplyr's functionality for calculating descriptive statistics is great, and its flexibility makes it really useful.
I would like to know whether the order of the calculations can be changed automatically. At the moment, each function is applied to all selected variables before the next function is applied. As a result, the output lists the minima for all variables, then the 25% quantiles, and so on. I would instead like all descriptive statistics for each variable to be displayed contiguously.
library(data.table)
library(dplyr)

mtcars %>%
  select(mpg, cyl, gear) %>%
  group_by(gear) %>%
  summarise_all(.tbl = ., funs(min = min(.),
                               q25 = quantile(., 0.25),
                               median = median(.),
                               q75 = quantile(., 0.75),
                               max = max(.),
                               mean = mean(.),
                               sd = sd(.)), na.rm = TRUE) %>%
  data.table(.)
# Current output
   gear mpg_min cyl_min mpg_q25 cyl_q25 mpg_median cyl_median mpg_q75 cyl_q75 mpg_max cyl_max mpg_mean cyl_mean   mpg_sd    cyl_sd
1:    3    10.4       4    14.5       8       15.5          8  18.400       8    21.5       8 16.10667 7.466667 3.371618 1.1872337
2:    4    17.8       4    21.0       4       22.8          4  28.075       6    33.9       6 24.53333 4.666667 5.276764 0.9847319
3:    5    15.0       4    15.8       4       19.7          6  26.000       8    30.4       8 21.38000 6.000000 6.658979 2.0000000
# Desired output (excerpt)
   gear mpg_min mpg_q25 mpg_median mpg_q75 mpg_max mpg_mean   mpg_sd cyl_min cyl_q25
1:    3    10.4    14.5       15.5  18.400    21.5 16.10667 3.371618       4       8
2:    4    17.8    21.0       22.8  28.075    33.9 24.53333 5.276764       4       4
3:    5    15.0    15.8       19.7  26.000    30.4 21.38000 6.658979       4       4
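
For illustration, here is a minimal sketch of one way the desired ordering might be produced. It assumes dplyr 1.0.0 or later, where `across()` is available; `across()` applies all functions to one column before moving on to the next, so the result columns are grouped by variable. The names `stat_fns` and `stats` are hypothetical and not part of the code above.

```r
library(dplyr)

# Hypothetical helper list (an assumption, not from the original code):
# one named function per statistic; the names become the column suffixes.
stat_fns <- list(
  min    = ~ min(.x, na.rm = TRUE),
  q25    = ~ quantile(.x, 0.25, na.rm = TRUE),
  median = ~ median(.x, na.rm = TRUE),
  q75    = ~ quantile(.x, 0.75, na.rm = TRUE),
  max    = ~ max(.x, na.rm = TRUE),
  mean   = ~ mean(.x, na.rm = TRUE),
  sd     = ~ sd(.x, na.rm = TRUE)
)

stats <- mtcars %>%
  select(mpg, cyl, gear) %>%
  group_by(gear) %>%
  # everything() skips the grouping column; columns are named <variable>_<statistic>
  summarise(across(everything(), stat_fns, .names = "{.col}_{.fn}"))

# Column names now run through all mpg statistics before the cyl statistics
names(stats)
```

If upgrading is not an option, the result of the existing pipeline could probably be reordered after the fact instead, for example with `select(gear, starts_with("mpg_"), starts_with("cyl_"))`.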