
I have a data frame which I want to sort. However, the column I want to sort by is a combination of a letter and a number.

df <- data.frame(a = sample(paste0("C", 1:20)), b = sample(LETTERS[1:26],20))
df[order(df$a),]

Ordering like this gives me C1, C10, C11, ..., C19, C2, C20, ...

What do I need to change in order to sort the column like this: C1, C2, C3, ...

Thank you. :-)

Revan

1 Answer


Sorting mixed character/numeric data like this is a common problem. One option is to keep sorting as text, but to pad the numeric part of every entry with leading zeros so that all entries have the same length, e.g. 3 digits. The stringr package has a function, str_pad, which can help here:

library(stringr)
df <- data.frame(a = sample(paste0("C", str_pad(c(1:20), 3, pad="0"))),
                 b = sample(LETTERS[1:26],20))
df[order(df$a),]

      a b
18 C001 R
11 C002 C
1  C003 O
2  C004 Q
10 C005 I
...
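
If you would rather keep the original labels (C1, C2, ...) instead of changing them to C001, C002, ..., another option is to order by the numeric part only. The following is a minimal sketch in base R, not part of the approach above. It assumes every entry has the form of a non-digit prefix followed by a single number. The names num and sorted are just illustrative, and set.seed is only there to make the sample reproducible.

```r
set.seed(1)  # only so that the random sample is reproducible
df <- data.frame(a = sample(paste0("C", 1:20)),
                 b = sample(LETTERS[1:26], 20))

# Remove every non-digit character, then convert what is left to a number
num <- as.numeric(gsub("[^0-9]", "", df$a))

# Order the rows by that number; column a itself is left unchanged
sorted <- df[order(num), ]
head(sorted)
```

With this, column a comes out as C1, C2, C3, ..., C20. If the column mixed several prefixes (which the example data does not), you would probably want to order by the prefix first and the number second, e.g. by passing both keys to order().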


Tim Biegeleisen