How do I sort a data.frame with one column?

I'm using the following:

> set.seed(456)
> df1 <- data.frame(col1 = runif(10))
> class(df1)
[1] "data.frame"
> df1 <- df1[order(df1$col1),]
> class(df1)
[1] "numeric"

However, if I add a blank column, things work fine:

> set.seed(456)
> df1 <- data.frame(col1 = runif(10))
> df1$dummy <- NA
> class(df1)
[1] "data.frame"
> df1 <- df1[order(df1$col1),]
> class(df1)
[1] "data.frame"
> df1 
         col1 dummy
7  0.08243274    NA
1  0.08955160    NA
2  0.21051232    NA
9  0.23750327    NA
8  0.28552695    NA
6  0.33195997    NA
10 0.38523617    NA
3  0.73295527    NA
5  0.78839789    NA
4  0.85213354    NA

Is there a better way to do this?

screechOwl
  • Possible duplicate of http://stackoverflow.com/questions/13156448/how-can-i-sort-a-data-frame-with-only-one-column-without-losing-rownames/13156498#13156498 or http://stackoverflow.com/questions/6894246/how-to-sort-a-data-frame-in-r/6894362#6894362 – Henrik Jul 15 '15 at 14:46

4 Answers

You could add drop = FALSE, which keeps the result a data.frame even when only one column is left. By default, [ uses drop = TRUE in this case, so a single-column result is simplified to a vector.

 df1[order(df1$col1),, drop=FALSE]

In the help page ?`[`, the default arguments are listed under 'Usage':

 x[i, j, ... , drop = TRUE]

and drop is described as:

drop: For matrices and arrays. If ‘TRUE’ the result is coerced to the lowest possible dimension (see the examples). This only works for extracting elements, not for the replacement. See ‘drop’ for further details.
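
A minimal sketch of the difference, using the df1 from the question. The names as_vector and as_frame are introduced here only for illustration:

```r
set.seed(456)
df1 <- data.frame(col1 = runif(10))

# Default behaviour: with a single column, the result is dropped to a plain vector
as_vector <- df1[order(df1$col1), ]
class(as_vector)   # "numeric"

# drop = FALSE keeps the data.frame structure (and the original row names)
as_frame <- df1[order(df1$col1), , drop = FALSE]
class(as_frame)    # "data.frame"
```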

akrun

With the data.table package, you don't need drop = FALSE:

library(data.table)
setorder(setDT(df1), col1)

which gives:

> df1
          col1
 1: 0.08243274
 2: 0.08955160
 3: 0.21051232
 4: 0.23750327
 5: 0.28552695
 6: 0.33195997
 7: 0.38523617
 8: 0.73295527
 9: 0.78839789
10: 0.85213354

Or directly on a dataframe without converting to a data.table:

library(data.table)
setorder(df1, col1)

which gives:

> df1
         col1
7  0.08243274
1  0.08955160
2  0.21051232
9  0.23750327
8  0.28552695
6  0.33195997
10 0.38523617
3  0.73295527
5  0.78839789
4  0.85213354
Jaap

You can also use dplyr.

library(dplyr)
df1 <- arrange(df1, col1)
class(df1)
[1] "data.frame"
user3274289

I'd recommend splitting the sorting operation into two steps, both to save resources and to make the code more readable:

> positions <- order(df1$col1)
# for descending order instead: positions <- order(df1$col1, decreasing = TRUE)
# then apply the order, leaving the original data frame unmodified
# (drop = FALSE keeps a one-column result as a data.frame)
> sorted_df <- df1[positions, , drop = FALSE]
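
One possible benefit of keeping the index separate is that it can be computed once and reused. The sketch below assumes a hypothetical vector labels that is aligned row by row with df1; it is not part of the original question:

```r
set.seed(456)
df1 <- data.frame(col1 = runif(10))
labels <- letters[1:10]   # hypothetical vector aligned row-by-row with df1

# Compute the sorting permutation once
positions <- order(df1$col1)

# Apply it to the data frame (drop = FALSE keeps it a data.frame) ...
sorted_df <- df1[positions, , drop = FALSE]

# ... and reuse the same permutation for the parallel vector
sorted_labels <- labels[positions]
```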
Lucas Massuh