I'm using the code below to generate a simple summary table:
# Data
data("mtcars")
# Lib
require(dplyr)
# Summary
mt_sum <- mtcars %>%
  group_by(am) %>%
  summarise_each(funs(min, mean, median, max), mpg, cyl) %>%
  mutate(am = as.character(am)) %>%
  left_join(y = as.data.frame(table(mtcars$am),
                              stringsAsFactors = FALSE),
            by = c("am" = "Var1"))
The code produces the desired results:
> head(mt_sum)
Source: local data frame [2 x 10]
     am mpg_min cyl_min mpg_mean cyl_mean mpg_median cyl_median mpg_max cyl_max  Freq
  (chr)   (dbl)   (dbl)    (dbl)    (dbl)      (dbl)      (dbl)   (dbl)   (dbl) (int)
1     0    10.4       4 17.14737 6.947368       17.3          8    24.4       8    19
2     1    15.0       4 24.39231 5.076923       22.8          4    33.9       8    13
However, I'm not satisfied with the way the columns are ordered. In particular, I would like to:
Order columns by name
Achieve that via select() in dplyr
Desired order
The desired order would look like that:
> names(mt_sum)[order(names(mt_sum))]
[1] "am" "cyl_max" "cyl_mean" "cyl_median" "cyl_min" "Freq" "mpg_max"
[8] "mpg_mean" "mpg_median" "mpg_min"
Attempts
Ideally, I would like to pass names(mt_sum)[order(names(mt_sum))] to select() as the way of sorting the columns. But the code:
mt_sum <- mtcars %>%
  group_by(am) %>%
  summarise_each(funs(min, mean, median, max), mpg, cyl) %>%
  mutate(am = as.character(am)) %>%
  left_join(y = as.data.frame(table(mtcars$am),
                              stringsAsFactors = FALSE),
            by = c("am" = "Var1")) %>%
  select(names(.)[order(names(.))])
will return the expected error:
Error: All select() inputs must resolve to integer column positions. The following do not: * names(.)[order(names(.))]
In my real data I'm generating a vast number of summary columns. Hence my question: how can I dynamically pass sorted column names to select() in dplyr so that it understands them and applies them to the data.frame at hand?
My focus is on figuring out a way of passing the dynamically generated column names to select(). I know that I could sort the columns in base or by typing names, as discussed here.
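For reference, the base route I mention above can be sketched roughly as follows. This is only an illustration of what I'm trying to avoid, not the select()-based solution I'm after; df is a hypothetical stand-in for mt_sum, built from a few mtcars columns so the snippet runs without dplyr. Note that the resulting order depends on the locale's collation rules (for example, where a capitalised name such as Freq ends up).

```r
# Hypothetical stand-in for mt_sum: any data frame with unsorted column names.
data("mtcars")
df <- head(mtcars[, c("mpg", "cyl", "am", "disp")])

# Base R: reorder the columns by indexing with the sorted name positions.
sorted_base <- df[, order(names(df))]

# Inspect the new column order.
names(sorted_base)
```

What I'd like is the equivalent of the df[, order(names(df))] step, but expressed inside the dplyr pipeline through select().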