I am looking for a suggestion on how to sort a data.frame by a column of alphanumeric strings:

Let's assume we have:

A = c("A1","A10","A11","A2")
B = c(1,2,3,4)

C = data.frame(A,B)

How can I sort the data.frame C so that C$A ends up in the order "A1", "A2", "A10", "A11"?

A5C1D2H2I1M1N2O1R2T1
Bogdan

2 Answers

Assuming that there is just a single alphabetic character at the start of each entry, a reasonable strategy is to sort by that letter first (using character, i.e. alphabetic, ordering), then by the numerical component (using numerical ordering).

(I am presuming that you might want to use this where the letter piece is not constant.)

You can do this with:

C[order(substr(C$A, 1, 1), as.numeric(substring(C$A, 2))), ]

If the strings are more general than 1 letter followed by a number, you could use regex to select the appropriate strings to order by.
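
As a rough illustration of the regex route, here is a minimal sketch. It assumes each entry is a run of non-digit characters followed by a run of digits; the data frame `C2` and the names `prefix` and `number` are made up for this example and are not part of the question.

```r
# Hypothetical data with a non-constant, multi-letter prefix (not from the question)
C2 <- data.frame(A = c("AB10", "A2", "AB9", "A10", "A1"), B = 1:5,
                 stringsAsFactors = FALSE)

prefix <- sub("[0-9]+$", "", C2$A)               # drop the trailing digits, keep the letters
number <- as.numeric(sub("^[^0-9]*", "", C2$A))  # drop the leading non-digits, keep the number

# Order by the letters first, then by the number as a numeric value
sorted <- C2[order(prefix, number), ]
sorted$A
## [1] "A1"   "A2"   "A10"  "AB9"  "AB10"
```

If the column is a factor, wrap it in as.character() before passing it to sub(), as the other examples here do.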

tegancp

You can try mixedorder from the "gtools" package. Here's what it does:

> library(gtools)
> mixedorder(as.character(C$A))
[1] 1 4 2 3

So, to sort by the "A" column:

C[mixedorder(as.character(C$A)), ]
##     A B
## 1  A1 1
## 4  A2 4
## 2 A10 2
## 3 A11 3

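For this particular data (a constant prefix, with the input already in alphabetical order), you also get the same ordering by sorting on string length alone: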

order(nchar(as.character(C$A)))
## [1] 1 4 2 3
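
Sorting on length alone leaves ties in their input order, so it can go wrong when the input is shuffled. A small sketch of a tie-break, assuming a constant prefix and no leading zeros in the numbers (the vector `x` is a made-up reshuffle of the question's values):

```r
x <- c("A10", "A2", "A11", "A1")   # same values as the question, shuffled

x[order(nchar(x))]                 # length only: "A2" still comes before "A1"

# Break ties within each length by the string itself
ord <- order(nchar(x), x)
x[ord]
## [1] "A1"  "A2"  "A10" "A11"
```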
A5C1D2H2I1M1N2O1R2T1