For this sample data.frame,

df <- data.frame(var1=c("b","a","b","a","a","b"),
                 var2=c("l","l","k","k","l","k"),
                 var3=c("t","t","x","t","x","x"),
                 var4=c(5,3,3,5,5,3),
                 stringsAsFactors=FALSE)

Unsorted

  var1 var2 var3 var4
1    b    l    t    5
2    a    l    t    3
3    b    k    x    3
4    a    k    t    5
5    a    l    x    5
6    b    k    x    3

I would like to sort on three columns, 'var2', 'var3' and 'var4', in this order and simultaneously: one column ascending and the other two descending. The names of the columns to sort on are stored in variables.

sort_asc <- "var2"
sort_desc <- c("var3","var4")

What's the best way to do this in base R?

Updated details

This is the output if sorted ascending by 'var2' first (step 1) and then descending by 'var3' and 'var4' (as step 2).

var1   var2 var3 var4
a      l    x    5
b      k    x    3
b      k    x    3
a      k    t    5
b      l    t    5
a      l    t    3

But what I am looking for is doing all three sorts at the same time to get this:

var1 var2 var3 var4
b    k    x    3
b    k    x    3
a    k    t    5
a    l    x    5
b    l    t    5
a    l    t    3

'var2' is ascending (k, l); within each 'var2' group, 'var3' is descending; and within each 'var3' group, 'var4' is descending.

To clarify how this question differs from other data.frame ordering questions:

  • ordering on multiple columns
  • column names to order on are stored in variables
  • different ordering directions (asc,desc)
  • ordering is not step-wise (one sort after another) but rather simultaneous (all selected columns at same time)
  • using base R, not dplyr
mindlessgreen
  • The order of the variables you want to order by also matters. For instance, do you want to order by the ascending variables first and then by the descending ones? Look also at the `decreasing` argument of `order`, which can be a vector. – nicola May 01 '18 at 12:44
  • @nicole I want to use all the variables at the same time. Not one after the other. – mindlessgreen May 01 '18 at 12:45
  • This doesn't make sense. When you order by two variables, rows are ordered by the first variable and, if the value of the first variable is the same, then the second variable breaks the ties. There is no "use all variables at the same time". – nicola May 01 '18 at 12:47
  • We need to understand how you want it to look when the different variables would put rows in different orders -- how do you want the nesting to work? – Elin May 01 '18 at 12:47
  • [R - order a data.frame by column name AS CHARACTER](https://stackoverflow.com/questions/6552640/r-order-a-data-frame-by-column-name-as-character); [Sort a data.frame by multiple columns whose names are contained in a single object?](https://stackoverflow.com/questions/16441952/sort-a-data-frame-by-multiple-columns-whose-names-are-contained-in-a-single-obje); [R - Ordering using do.call with descending order](https://stackoverflow.com/questions/40004217/r-ordering-using-do-call-with-descending-order) – Henrik May 01 '18 at 12:50
  • [A quick question on sequential sort/order in R program](https://stackoverflow.com/questions/6597218/a-quick-question-on-sequential-sort-order-in-r-program) – Henrik May 01 '18 at 12:51

1 Answer

Step-wise ordering (sort ascending first and then descending).

dplyr solution:

library(dplyr)
df %>% 
   arrange_at(sort_asc) %>%
   arrange_at(sort_desc, desc)

  var1 var2 var3 var4
1    a    l    x    5
2    b    k    x    3
3    b    k    x    3
4    a    k    t    5
5    b    l    t    5
6    a    l    t    3

base R solution:

With base R, when ordering on multiple columns in general, call order through do.call. Here, we first create the index for ascending order, then reorder the result in descending order using the second set of columns ('sort_desc').

i1 <- do.call(order, df[sort_asc]) 
df1 <- df[i1,]
i2 <-  do.call(order, c(df1[sort_desc], list(decreasing = TRUE)))
df1[i2,]

  var1 var2 var3 var4
5    a    l    x    5
3    b    k    x    3
6    b    k    x    3
4    a    k    t    5
1    b    l    t    5
2    a    l    t    3
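
One way to read the step-wise result: `order` leaves ties in their original order, so sorting by 'var2' first and then by 'var3' and 'var4' behaves like a single sort in which the keys applied last ('var3', 'var4', descending) take priority and 'var2' only breaks the remaining ties. A minimal sketch of that single-call equivalent, reusing the sample data from the question (the names `keys`, `idx` and `stepwise_equiv` are illustrative and not from the original answer):

```r
df <- data.frame(var1 = c("b", "a", "b", "a", "a", "b"),
                 var2 = c("l", "l", "k", "k", "l", "k"),
                 var3 = c("t", "t", "x", "t", "x", "x"),
                 var4 = c(5, 3, 3, 5, 5, 3),
                 stringsAsFactors = FALSE)
sort_asc  <- "var2"
sort_desc <- c("var3", "var4")

# Descending keys come first (highest priority); the ascending key
# comes last, so it only breaks ties among equal var3/var4 values.
keys <- c(lapply(df[sort_desc], function(x) -xtfrm(x)),
          as.list(df[sort_asc]))

# unname() so the column names are not matched against order()'s arguments
idx <- do.call(order, unname(keys))
stepwise_equiv <- df[idx, ]
stepwise_equiv
```

With the sample data this should give the same row sequence as the two-step version above, which makes it easier to see why it differs from the output the question asks for: there, 'var2' is the highest-priority key rather than the lowest.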

Simultaneous/Sequential ordering (all ordering variables are used in one ordering step):

dplyr solution:

df %>% 
   arrange_(.dots  = c(sort_asc, paste0("desc(", sort_desc, ")")))

#   var1 var2 var3 var4
#1    b    k    x    3
#2    b    k    x    3
#3    a    k    t    5
#4    a    l    x    5
#5    b    l    t    5
#6    a    l    t    3

base R solution:

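With base R, to get the same output as with arrange_, negate the descending keys with xtfrm. It converts each column to a numeric sort key, so the negation also works for character columns: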

df[do.call(order, c(as.list(df[sort_asc]), lapply(df[sort_desc], 
               function(x) -xtfrm(x)))),]

#  var1 var2 var3 var4
#3    b    k    x    3
#6    b    k    x    3
#4    a    k    t    5
#5    a    l    x    5
#1    b    l    t    5
#2    a    l    t    3
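
As the comment on the question points out, the `decreasing` argument of `order` can be a vector. To my knowledge this requires `method = "radix"`, which is available in reasonably recent versions of R. The sketch below is an alternative to negating with `xtfrm`, reusing the sample data from the question; the names `sort_cols`, `dec`, `idx` and `sorted` are illustrative. One caveat: the radix method compares strings in the C locale, so results may differ from locale-aware collation for mixed-case or non-ASCII values.

```r
df <- data.frame(var1 = c("b", "a", "b", "a", "a", "b"),
                 var2 = c("l", "l", "k", "k", "l", "k"),
                 var3 = c("t", "t", "x", "t", "x", "x"),
                 var4 = c(5, 3, 3, 5, 5, 3),
                 stringsAsFactors = FALSE)
sort_asc  <- "var2"
sort_desc <- c("var3", "var4")

# Sort columns in priority order: ascending key first, then descending keys
sort_cols <- c(sort_asc, sort_desc)

# One direction flag per sort column: FALSE = ascending, TRUE = descending
dec <- rep(c(FALSE, TRUE), times = c(length(sort_asc), length(sort_desc)))

# unname() so the column names are not matched against order()'s arguments
idx <- do.call(order, c(unname(as.list(df[sort_cols])),
                        list(decreasing = dec, method = "radix")))
sorted <- df[idx, ]
sorted
```

Keeping the directions in a logical vector aligned with `sort_cols` means the same call handles any number of ascending and descending columns without building `desc(...)` strings or negating keys.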
mindlessgreen
akrun