
Consider:

> output<-cbind(matrix(sample(15,replace = TRUE),nrow=5,ncol=3),c(sample(5,replace = TRUE)+20),c(16,16,16,16,15))
> output
     [,1] [,2] [,3] [,4] [,5]
[1,]    5    8    3   25   16
[2,]    7    3    6   23   16
[3,]    7    9    7   21   16
[4,]    2    8   13   23   16
[5,]   11    1    3   22   15

Now suppose that I want to sort this by column 4, breaking ties by column 5. With order and a little help from Stack Overflow, this is not much of a challenge:

> output[order(output[,4],output[,5]),]
     [,1] [,2] [,3] [,4] [,5]
[1,]    7    9    7   21   16
[2,]   11    1    3   22   15
[3,]    7    3    6   23   16
[4,]    2    8   13   23   16
[5,]    5    8    3   25   16

My question concerns a final requirement: how do I break any remaining ties by a function of the tied rows' entries in columns 1, 2 and 3? For example, how could I achieve this sort: "Sort by column 4 in increasing order. If there's a tie, sort by column 5 in increasing order. If that is also tied, put the row with the lowest value across columns 1, 2 and 3 first (i.e. sort by min(col 1, col 2, col 3))"?

Expected output: in the above case, rows 3 and 4 of the sorted result will be swapped, because min(2,8,13) is less than min(7,3,6).

J. Mini
  • @akrun Made an edit, that should do. – J. Mini Jun 06 '20 at 18:48
  • One thing to note is that creating your example matrix using randomly generated matrices (using `sample` in this case) will result in the answers having different values in their data. To make it more reproducible you can set the seed (`set.seed`) beforehand, or generate the data once and use `dput` to paste an easily copyable matrix that will be consistent for everyone. – RyanFrost Jun 06 '20 at 20:34
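The suggestion in the comment above can be sketched as follows. The seed value 42 and the name `output2` are arbitrary choices for illustration and do not come from the question.

```r
# Fix the seed so sample() draws the same values on every run
# (42 is an arbitrary, illustrative seed).
set.seed(42)
output <- cbind(matrix(sample(15, replace = TRUE), nrow = 5, ncol = 3),
                sample(5, replace = TRUE) + 20,
                c(16, 16, 16, 16, 15))

# dput() prints R code that rebuilds the object exactly;
# that printed code can be pasted into a question or answer.
dput(output)

# Re-seeding and regenerating gives the same matrix again.
set.seed(42)
output2 <- cbind(matrix(sample(15, replace = TRUE), nrow = 5, ncol = 3),
                 sample(5, replace = TRUE) + 20,
                 c(16, 16, 16, 16, 15))
identical(output, output2)
```

Either approach works; `dput` has the advantage that readers get the same data even if their R version draws random numbers differently.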

2 Answers


Here it is in base R:

output<-cbind(matrix(sample(15,replace = TRUE),nrow=5,ncol=3),
              c(sample(5,replace = TRUE)+20),c(16,16,16,16,15))


output[order(output[,4], output[,5], apply(output[, 1:3], 1, min)),]
#>      [,1] [,2] [,3] [,4] [,5]
#> [1,]   14   15    2   21   15
#> [2,]    9    6    4   22   16
#> [3,]   11   12   12   22   16
#> [4,]   13    5    7   25   16
#> [5,]   15   10    6   25   16

We use apply to find the row-wise minimums of the first three columns, and treat that vector as a third sorting criterion.
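Because the matrix above is generated randomly, it may help to trace the same idea on the fixed values printed in the question. This is a minimal sketch; the names `row_min`, `ord` and `sorted` are introduced here for illustration only.

```r
# The matrix exactly as printed in the question.
output <- cbind(c(5, 7, 7, 2, 11), c(8, 3, 9, 8, 1),
                c(3, 6, 7, 13, 3), c(25, 23, 21, 23, 22),
                c(16, 16, 16, 16, 15))

# Row-wise minimum of columns 1 to 3: one value per row.
row_min <- apply(output[, 1:3], 1, min)
row_min
#> [1] 3 3 7 2 1

# Sort by column 4, then column 5, then the row-wise minimum.
ord <- order(output[, 4], output[, 5], row_min)
ord
#> [1] 3 5 4 2 1

sorted <- output[ord, ]
sorted
```

Original rows 2 and 4 tie on columns 4 and 5 (23 and 16), so the row-wise minimum decides: row 4 (minimum 2) comes before row 2 (minimum 3), which is the swap the question asks for. Any function that returns one value per row, such as `max` or `sum`, can be substituted for `min` in the `apply` call.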

If you're willing to work with data frames instead, dplyr can make this much easier to read:

library(dplyr)
output %>%
  as.data.frame() %>% 
  arrange(V4, V5, pmin(V1, V2, V3))
#>   V1 V2 V3 V4 V5
#> 1 14 15  2 21 15
#> 2  9  6  4 22 16
#> 3 11 12 12 22 16
#> 4 13  5  7 25 16
#> 5 15 10  6 25 16

Created on 2020-06-06 by the reprex package (v0.3.0)

RyanFrost
  • Instead of the `apply(...)` a `pmin(output[,1], output[,2], output[,3])` could also be used. – Bas Jun 06 '20 at 20:21
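A quick check of the `pmin` alternative from the comment above, using the fixed matrix from the question. The names `via_apply` and `via_pmin` are illustrative.

```r
# The matrix exactly as printed in the question.
output <- cbind(c(5, 7, 7, 2, 11), c(8, 3, 9, 8, 1),
                c(3, 6, 7, 13, 3), c(25, 23, 21, 23, 22),
                c(16, 16, 16, 16, 15))

# apply() calls min() once per row.
via_apply <- apply(output[, 1:3], 1, min)

# pmin() takes the columns as separate vectors and returns
# the element-wise (parallel) minimum in a single vectorised call.
via_pmin <- pmin(output[, 1], output[, 2], output[, 3])

# Both give the same row-wise minimums, hence the same ordering.
all(via_apply == via_pmin)
output[order(output[, 4], output[, 5], via_pmin), ]
```

For large matrices the `pmin` form is likely to be faster, since it avoids one function call per row, though for a handful of rows the difference should not matter.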

In base R, we can do

output1 <- as.data.frame(output)
output[do.call(order, c(output1[4:5], list(do.call(pmin, output1[1:3])))),]
#      [,1] [,2] [,3] [,4] [,5]
#[1,]    7    9    7   21   16
#[2,]   11    1    3   22   15
#[3,]    2    8   13   23   16
#[4,]    7    3    6   23   16
#[5,]    5    8    3   25   16
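The nested `do.call` one-liner can be hard to read, so here is the same computation unpacked step by step, as a sketch. The intermediate names `row_min`, `keys` and `ord` are introduced here for illustration and are not part of the answer.

```r
# The matrix exactly as printed in the question.
output <- cbind(c(5, 7, 7, 2, 11), c(8, 3, 9, 8, 1),
                c(3, 6, 7, 13, 3), c(25, 23, 21, 23, 22),
                c(16, 16, 16, 16, 15))

# A data frame is a list of columns (named V1..V5 here),
# which is the form do.call() needs for its argument list.
output1 <- as.data.frame(output)

# do.call(pmin, output1[1:3]) is equivalent to pmin(V1, V2, V3).
row_min <- do.call(pmin, output1[1:3])

# Collect the sort keys in priority order: V4, V5, then the row minimum.
keys <- c(output1[4:5], list(row_min))

# do.call(order, keys) is equivalent to order(V4, V5, row_min).
ord <- do.call(order, keys)

# Index the original matrix, so the result is still a matrix.
output[ord, ]
```

The advantage of the `do.call` form is that the key columns are selected by index, so the same line works when the number of columns to minimise over or to sort by changes.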

data

output <- cbind(c(5, 7, 7, 2, 11), c(8, 3, 9, 8, 1),
   c(3, 6, 7, 13, 3), c(25, 23, 21, 23, 22), c(16, 16, 16, 16, 15))
akrun