I would like to use dplyr::arrange to sort the rows of a data frame based on values in specific columns. I want to choose the columns based on position rather than column name, as the column names will vary based on the input. I have tried to adapt the suggestions for dplyr::select found here (dplyr: select columns by position in NSE), but my code just returns the original data frame with no changes. Here is my data frame and the code I've used to sort it:

df <- structure(list(D7_ctrl_v_D6_ctrl_deg = structure(c(1L, 2L, 3L, 
1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 
2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L), .Label = c("down", "unchanged", 
"up"), class = "factor"), D7_OE_v_D7_ctrl_deg = structure(c(1L, 
1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 
3L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L), .Label = c("down", "unchanged", 
"up"), class = "factor"), D7_OE_v_D6_ctrl_deg = structure(c(1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L), .Label = c("down", "unchanged", 
"up"), class = "factor"), Freq = c(584L, 1841L, 375L, 636L, 331L, 
0L, 44L, 0L, 0L, 0L, 420L, 600L, 208L, 9164L, 280L, 391L, 410L, 
0L, 0L, 0L, 69L, 0L, 448L, 746L, 297L, 2362L, 715L)), class = "data.frame", row.names = c(NA, 
-27L))

## none of these sort the data; each returns the data frame unchanged
df %>% dplyr::arrange(1)
df %>% dplyr::arrange(c(1))
df %>% dplyr::arrange(!!"1")
df %>% dplyr::arrange(!!c(1))

I'm guessing the difference has something to do with the difference between data-masking used by arrange and tidy-select used by select, but I can't figure out if there is a way to pick columns by position in arrange. Any suggestions would be appreciated. Thanks.

Josh

2 Answers

With dplyr >= 1.0.0 you can use across() to specify columns by position:

df %>% dplyr::arrange(across(1))
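
To make the difference concrete, here is a minimal sketch on a small stand-in data frame (the name toy and its values are hypothetical, not the asker's data). It assumes dplyr >= 1.0.0. A bare 1 is treated as a constant sort key, while across(1) is resolved as a column position; passing desc as the function gives a descending sort.

```r
library(dplyr)

# Hypothetical stand-in data frame, used only for illustration
toy <- data.frame(
  grp = c("up", "down", "unchanged"),
  n = c(3L, 1L, 2L),
  stringsAsFactors = FALSE
)

# A bare number is evaluated as a constant, so the row order does not change
unchanged <- toy %>% arrange(1)

# across() takes a tidy-select specification, so 1 means "the first column"
by_first <- toy %>% arrange(across(1))

# Supplying desc as the function sorts the selected column in descending order
by_second_desc <- toy %>% arrange(across(2, desc))
```

The same pattern should extend to several positions, e.g. across(c(2, 1)), with the sort keys applied in the order given.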
MrFlick

You have a few options. You can use the scoped functions such as arrange_at, which accept a numeric vector of column positions:

library(dplyr)

df <- iris
num_vec <- c(3, 1)
df %>% arrange_at(1) %>% head()
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> 1          4.3         3.0          1.1         0.1  setosa
#> 2          4.4         2.9          1.4         0.2  setosa
#> 3          4.4         3.0          1.3         0.2  setosa
#> 4          4.4         3.2          1.3         0.2  setosa
#> 5          4.5         2.3          1.3         0.3  setosa
#> 6          4.6         3.1          1.5         0.2  setosa
df %>% arrange_at(num_vec) %>% head()
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> 1          4.6         3.6          1.0         0.2  setosa
#> 2          4.3         3.0          1.1         0.1  setosa
#> 3          5.0         3.2          1.2         0.2  setosa
#> 4          5.8         4.0          1.2         0.2  setosa
#> 5          4.4         3.0          1.3         0.2  setosa
#> 6          4.4         3.2          1.3         0.2  setosa

Or you can use across(), the newer function that supersedes the scoped variants:

df %>% arrange(across(1)) %>% head()
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> 1          4.3         3.0          1.1         0.1  setosa
#> 2          4.4         2.9          1.4         0.2  setosa
#> 3          4.4         3.0          1.3         0.2  setosa
#> 4          4.4         3.2          1.3         0.2  setosa
#> 5          4.5         2.3          1.3         0.3  setosa
#> 6          4.6         3.1          1.5         0.2  setosa
df %>% arrange(across(all_of(num_vec))) %>% head()
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> 1          4.6         3.6          1.0         0.2  setosa
#> 2          4.3         3.0          1.1         0.1  setosa
#> 3          5.0         3.2          1.2         0.2  setosa
#> 4          5.8         4.0          1.2         0.2  setosa
#> 5          4.4         3.0          1.3         0.2  setosa
#> 6          4.4         3.2          1.3         0.2  setosa

Created on 2021-05-25 by the reprex package (v2.0.0)

With across() you can use tidyselect helpers such as all_of() to choose the columns programmatically, without relying on NSE.
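
For comparison, a rough sketch of a few other ways to sort by position, using iris as above. The variable pos is a hypothetical stand-in for num_vec. The pick() call assumes dplyr >= 1.1.0, where that function was added, so it is guarded by a version check.

```r
library(dplyr)

pos <- c(3, 1)  # hypothetical positions, mirroring num_vec above

# Translate positions to column names, then select those names with all_of()
by_name <- iris %>% arrange(across(all_of(names(iris)[pos])))

# Base R equivalent: pass the selected columns to order() as sort keys
by_base <- iris[do.call(order, unname(as.list(iris[pos]))), ]

# pick() only exists in dplyr 1.1.0 and later, so check the installed version
if (utils::packageVersion("dplyr") >= "1.1.0") {
  by_pick <- iris %>% arrange(pick(all_of(pos)))
}
```

All of these sort by Petal.Length first and Sepal.Length second; which one to prefer mostly depends on the dplyr version you can rely on.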

Justin Landis