dplyr's functionality for calculating descriptive statistics is great, and its flexibility makes it really useful. I would like to know whether the order of the calculations can be changed automatically: at the moment, each function is applied to all selected variables before the next function is applied. As a result, the output lists the minima for all variables, then the 25% quantiles, and so on. Is it possible to display all descriptive statistics for each variable consecutively?

library(data.table)
library(dplyr)
mtcars %>% 
  select(mpg, cyl, gear) %>% 
  group_by(gear) %>%
  summarise_all(.tbl = ., funs(min = min(.), 
                               q25 = quantile(., 0.25), 
                               median = median(.), 
                               q75 = quantile(., 0.75), 
                               max = max(.), 
                               mean = mean(.), 
                               sd = sd(.)), na.rm = TRUE) %>% 
  data.table(.)
# Output now
   gear mpg_min cyl_min mpg_q25 cyl_q25 mpg_median cyl_median mpg_q75 cyl_q75 mpg_max cyl_max mpg_mean cyl_mean   mpg_sd    cyl_sd
1:    3    10.4       4    14.5       8       15.5          8  18.400       8    21.5       8 16.10667 7.466667 3.371618 1.1872337
2:    4    17.8       4    21.0       4       22.8          4  28.075       6    33.9       6 24.53333 4.666667 5.276764 0.9847319
3:    5    15.0       4    15.8       4       19.7          6  26.000       8    30.4       8 21.38000 6.000000 6.658979 2.0000000

# Desired output (excerpt)
   gear mpg_min mpg_q25 mpg_median mpg_q75 mpg_max mpg_mean   mpg_sd cyl_min cyl_q25
1:    3    10.4    14.5       15.5  18.400    21.5 16.10667 3.371618       4       8
2:    4    17.8    21.0       22.8  28.075    33.9 24.53333 5.276764       4       4
3:    5    15.0    15.8       19.7  26.000    30.4 21.38000 6.658979       4       4
Frank
hannes101

1 Answer

OK, it's possible with some small tweaks, and I think the result is pretty nice. I prefix the function names with letters (a_, b_, ...) so that the resulting column suffixes sort alphabetically in the order I want, and then sort all columns except the grouping column by name.

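mtcars %>% 
  select(mpg, cyl, gear) %>% 
  group_by(gear) %>%
  summarise_all(.tbl = ., funs(a_min = min(.), 
                               b_q25 = quantile(., 0.25), 
                               c_median = median(.), 
                               d_q75 = quantile(., 0.75), 
                               e_max = max(.), 
                               f_mean = mean(.), 
                               g_sd = sd(.)), na.rm = TRUE) %>% 
  # Select by sorted name rather than by order() position: the positions
  # refer to names(.)[-1] and would be off by one in the full table.
  # Variables come out alphabetically (cyl before mpg), gear stays first.
  select(gear, sort(names(.)[-1]))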
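
For what it's worth, here is a sketch of an alternative that avoids renaming the functions. It assumes dplyr >= 1.0.0, where across() is available; as far as I can tell, across() builds its result column by column (every function for the first variable, then every function for the next), so the output should already be grouped per variable. The list name stat_funs and the object res are just names I made up for illustration, and na.rm = TRUE is passed explicitly to each function instead of through summarise_all().

```r
library(dplyr)  # assumption: dplyr >= 1.0.0 for across() and .groups

# Named list of summary functions; the names become the column suffixes.
# names = FALSE drops the "25%" / "75%" labels that quantile() attaches.
stat_funs <- list(
  min    = ~ min(.x, na.rm = TRUE),
  q25    = ~ quantile(.x, 0.25, na.rm = TRUE, names = FALSE),
  median = ~ median(.x, na.rm = TRUE),
  q75    = ~ quantile(.x, 0.75, na.rm = TRUE, names = FALSE),
  max    = ~ max(.x, na.rm = TRUE),
  mean   = ~ mean(.x, na.rm = TRUE),
  sd     = ~ sd(.x, na.rm = TRUE)
)

res <- mtcars %>%
  select(mpg, cyl, gear) %>%
  group_by(gear) %>%
  # everything() skips the grouping column inside a grouped summarise()
  summarise(across(everything(), stat_funs), .groups = "drop")

# Inspect the column order
names(res)
```

Because the variables keep the order given in select(), mpg stays ahead of cyl here, which matches the desired output in the question. Wrapping the result in data.table(.) afterwards works the same way as before.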
hannes101