dplyr: order columns alphabetically in R

Question

If I have a large DF (hundreds and hundreds) columns with different col_names randomly distributed alphabetically:

df.x <- data.frame(2:11, 1:10, rnorm(10))
colnames(df.x) <- c("ID", "string", "delta")

How would I order all of the data (vertically) by col_name alphabetically?

Essentially, I have hundreds of CSV(sep="|") text files that I need to read their columns into a single df, order those columns alphabetically and then use some other dplyf functions to get a final result. I have all of this figured out except how to order the columns alphabetically. I do not want to sort the columns (up and down) by alphabet, rather, the actual vertical orientation of the col_names and their corresponding data. Analogous to cutting and pasting entire columns of data in Excel.

For example I reviewed this approach but this is the "sort" the rows alphabetically bit, which is not what I'm looking to do.

How to sort a dataframe by column(s)?

Thanks!

Brilliant! I was wondering if a subset would be involved. Is that analogous to a copy/cut function in Excel? — Zach, Apr 26 '15 at 03:33

score 78 · Accepted Answer · edited Apr 13 '22 at 09:58

78

Try this

df %>% select(noquote(order(colnames(df))))

or just

df[,order(colnames(df))]

Update Dec 2021

New versions of dplyr (>= 1.0.7) work without the noquote:

df %>% select(order(colnames(df)))

edited Apr 13 '22 at 09:58

Rodrigo Zepeda

1,935
2
15
25

answered Apr 26 '15 at 03:26

Koundy

5,265
3
24
37

50

Or simply `df.x %>% select(order(colnames(.)))` – Steven Beaupré Apr 26 '15 at 04:51
Have recent versions of `dplyr` changed how this works? I know that there were changes to how quoted versus unquoted variable names were treated in some functions. – randy Jun 21 '21 at 22:23

score 54 · Answer 2 · edited Jun 30 '23 at 09:01

54

An alternative way to do this in dplyr is to use tidyselect::peek_vars

iris %>% 
  select(sort(tidyselect::peek_vars()))

current_vars() returns column names such that they're sortable, and select() will take the vector of column names.

edited Jun 30 '23 at 09:01

Julien

1,613
1
10
26

answered Oct 26 '17 at 09:13

Steph Locke

5,951
4
39
77

`current_vars()` but not `everything()` works for me (`dplyr` 0.7.6). I don't get the above error. Also noteworthy is that `iris %>% select(sort(current_vars()), -Species)` works, but not `iris %>% select(-Species, sort(current_vars()))`. – Joe Sep 28 '18 at 08:50
22

Update Dec 2019. The `current_vars()` has been deprecated in favor of `tidyselect::peek_vars()`. The above code works with this substitution. `select(sort(tidyselect::peek_vars()))` – John J. Dec 02 '19 at 02:22
2

You could also use the new `relocate()` verb. ```iris %>% relocate(sort(current_vars))``` – hnagaty Apr 08 '21 at 10:51
@Julien The [original answer stated `current_vars()`](https://stackoverflow.com/posts/46950262/revisions). Now you've basically edited this answer to contain the answer of me below. How does that make sense, improving answers by replacing them with existing answers?? I don't understand how that got approved. – MS Berends Jul 02 '23 at 14:20
It allows to see a correct answer directly, instead of looking in the comments – Julien Jul 02 '23 at 18:18

score 18 · Answer 3 · answered May 23 '20 at 19:46

18

If a specific column (or columns) has to be the first one (or last), but the rest is ordered, you can:

mtcars %>% tibble %>% 
  select("hp", sort(colnames(.)))

answered May 23 '20 at 19:46

HBat

4,873
4
39
56

score 10 · Answer 4 · answered Mar 17 '22 at 16:17

For a partial sorting using the recent tidyverse (more specific, the tidyselect package), use peek_vars():

library(dplyr)

starwars
#> # A tibble: 87 x 14
#>    name               height  mass hair_color    skin_color  eye_color birth_year
#>    <chr>               <int> <dbl> <chr>         <chr>       <chr>          <dbl>
#>  1 Luke Skywalker        172    77 blond         fair        blue            19.0
#>  2 C-3PO                 167    75 NA            gold        yellow         112.0
#>  3 R2-D2                  96    32 NA            white, blue red             33.0
#>  4 Darth Vader           202   136 none          white       yellow          41.9
#>  5 Leia Organa           150    49 brown         light       brown           19.0
#>  6 Owen Lars             178   120 brown, grey   light       blue            52.0
#>  7 Beru Whitesun lars    165    75 brown         light       blue            47.0
#>  8 R5-D4                  97    32 NA            white, red  red               NA
#>  9 Biggs Darklighter     183    84 black         light       brown           24.0
#> 10 Obi-Wan Kenobi        182    77 auburn, white fair        blue-gray       57.0
#> # ... with 77 more rows, and 7 more variables: sex <chr>, gender <chr>,
#> #   homeworld <chr>, species <chr>, films <list>, vehicles <list>, starships <list>

starwars %>% select(name, mass, sort(tidyselect::peek_vars()))
#> # A tibble: 87 x 14
#>    name                mass birth_year eye_color films gender    hair_color    height
#>    <chr>              <dbl>      <dbl> <chr>     <lis> <chr>     <chr>          <int>
#>  1 Luke Skywalker        77       19.0 blue      <chr> masculine blond            172
#>  2 C-3PO                 75      112.0 yellow    <chr> masculine NA               167
#>  3 R2-D2                 32       33.0 red       <chr> masculine NA                96
#>  4 Darth Vader          136       41.9 yellow    <chr> masculine none             202
#>  5 Leia Organa           49       19.0 brown     <chr> feminine  brown            150
#>  6 Owen Lars            120       52.0 blue      <chr> masculine brown, grey      178
#>  7 Beru Whitesun lars    75       47.0 blue      <chr> feminine  brown            165
#>  8 R5-D4                 32         NA red       <chr> masculine NA                97
#>  9 Biggs Darklighter     84       24.0 brown     <chr> masculine black            183
#> 10 Obi-Wan Kenobi        77       57.0 blue-gray <chr> masculine auburn, white    182
#> # ... with 77 more rows, and 6 more variables: homeworld <chr>, sex <chr>,
#> #   skin_color <chr>, species <chr>, starships <list>, vehicles <list>

score 2 · Answer 5 · answered Jun 30 '23 at 09:15

2

An option with starts_with():

library(dplyr)
df.x |> 
  select(starts_with(LETTERS))

answered Jun 30 '23 at 09:15

Julian

6,586
2
9
33

score 1 · Answer 6 · answered Aug 10 '22 at 09:46

1

Using dplyr:

df.x %>% 
  select(sort(names(.)))

answered Aug 10 '22 at 09:46

Benoit Lamarsaude

375
3
11

score 1 · Answer 7 · answered Jan 31 '23 at 00:04

1

Since no answers yet mention relocate():

df |>
  dplyr::relocate(sort(names(df)))

https://dplyr.tidyverse.org/reference/relocate.html

answered Jan 31 '23 at 00:04

MichaelChirico

33,841
14
113
198

score 0 · Answer 8 · answered May 06 '22 at 21:14

We can define :

abc <- function(..., desc = FALSE) {
  data <- tidyselect::peek_data()
  named_selection <- tidyselect::eval_select(rlang::expr(c(...)), data)
  named_selection[order(names(named_selection), named_selection, decreasing = desc)]
}

Unlike other solutions you can call it around any tidy selection, but in our case it can be only:

df %>% select(abc(everything()))

score -1 · Answer 9 · answered Sep 19 '19 at 01:21

-1

Why not just:

sort(colnames(df.x))

[1] "delta"  "ID"     "string"

answered Sep 19 '19 at 01:21

Frank B.

1,813
5
24
44

dplyr: order columns alphabetically in R

9 Answers9

Linked

Related