
I would like to sort a column containing character strings like this:

K3SG1-105-1051-1

However, using the arrange function will result in this:

K3SG1-105-1051-1

K3SG1-105-1051-10

K3SG1-105-1051-100

K3SG1-105-1051-1000

Instead of what I want:

K3SG1-105-1051-1

K3SG1-105-1051-2

K3SG1-105-1051-3

K3SG1-105-1051-4

Thanks in advance.

tyluRp
MWhite

2 Answers


Data

I created the following example data for this answer:

(char_vec <- paste0("K3SG1-105-1051-", c(1:4, 10, 100, 1000)))

[1] "K3SG1-105-1051-1"    "K3SG1-105-1051-2"    "K3SG1-105-1051-3"   
[4] "K3SG1-105-1051-4"    "K3SG1-105-1051-10"   "K3SG1-105-1051-100" 
[7] "K3SG1-105-1051-1000"

Solution

char_vec[order(as.numeric(sub('.*-', '', char_vec)))]

[1] "K3SG1-105-1051-1"    "K3SG1-105-1051-2"    "K3SG1-105-1051-3"   
[4] "K3SG1-105-1051-4"    "K3SG1-105-1051-10"   "K3SG1-105-1051-100" 
[7] "K3SG1-105-1051-1000"

Explanation

sub('.*-', '', char_vec) strips everything up to and including the last hyphen, leaving only the trailing digits of each element. We convert those to numeric and pass them to order(), which returns the indices used to reorder char_vec.

If you sort the strings "1", "2", and "10", the result is "1", "10", "2", because strings are compared character by character rather than by numeric value.
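To see the difference concretely, here is a minimal sketch comparing the two sort orders on a small made-up vector (x is a hypothetical name, not part of the question's data). It passes method = "radix" so that the string sort does not depend on the session's locale.

```r
# Hypothetical example vector of number-like strings
x <- c("1", "2", "10")

# Sorting as strings compares character by character, so "10" lands before "2"
as_strings <- sort(x, method = "radix")

# Converting to numeric first sorts by value instead
as_numbers <- sort(as.numeric(x))

print(as_strings)
print(as_numbers)
```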

duckmayr

Here is a possibility using tidyr::separate and dplyr:

library(dplyr)
library(tidyr)

# Sample data
df <- data.frame(id = paste0("K3SG1-105-1051-", 1:10))

# Split id on "-", sort by the numeric value of the last piece, keep only id
df %>%
    separate(id, into = paste0("id", 1:4), sep = "-", remove = FALSE) %>%
    arrange(as.numeric(id4)) %>%
    select(id)
#                  id
#1   K3SG1-105-1051-1
#2   K3SG1-105-1051-2
#3   K3SG1-105-1051-3
#4   K3SG1-105-1051-4
#5   K3SG1-105-1051-5
#6   K3SG1-105-1051-6
#7   K3SG1-105-1051-7
#8   K3SG1-105-1051-8
#9   K3SG1-105-1051-9
#10 K3SG1-105-1051-10

Explanation: Split column id into four separate columns based on "-" as separator; arrange rows based on the fourth column entries, which are converted to numeric to ensure proper ordering.
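If the helper columns are not needed, the same ordering can probably be reached by computing the sort key directly inside arrange. The following is only a sketch under assumptions: the sample data is hypothetical and deliberately shuffled, and it assumes every id ends in a hyphen followed by digits.

```r
library(dplyr)

# Hypothetical sample data, shuffled so that the sort has something to do
df <- data.frame(id = paste0("K3SG1-105-1051-", c(10, 2, 100, 1, 3)))

# Sort rows by the numeric value of the text after the last hyphen
sorted <- df %>%
    arrange(as.numeric(sub(".*-", "", id)))

print(sorted)
```

If an id does not end in digits, as.numeric returns NA with a warning, so the assumption about the suffix should be checked first.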

Maurits Evers