I have a data frame that I am trying to sort by a character column whose values contain numbers.

The column looks like this (Team1, Team2, Team3, Team4, Team5, ..., Team10), and when I sort it I get (Team1, Team10, Team2, ...). The column holds hundreds of different terms like these, so is there a way to sort it so that the numbers are compared numerically, with Team2 coming before Team3 and Team10 coming last?

  • You can use `df1 <- df1[gtools::mixedorder(df1$col1), ]`, which reorders whole rows rather than just the one column – akrun May 27 '20 at 19:23
  • Alternatively, you could extract the team number, convert it to numeric, and use that to sort (e.g. `df %>% mutate(team_no = as.numeric(str_extract(col1, "\\d+"))) %>% arrange(team_no)`) – Owe Jessen May 27 '20 at 19:42

1 Answer

Using base R:

set.seed(357)
xy <- paste("Team", sample(1:10), sep = "")

Sorting "the dumb way" with plain `sort()` compares the strings character by character, so "Team10" lands before "Team2".

xy.sort <- sort(xy)
xy.sort

[1] "Team1"  "Team10" "Team2"  "Team3"  "Team4"  "Team5"  "Team6"  "Team7"  "Team8"  "Team9" 

If you extract the numbers and convert them to numeric, you can use them to order the original vector (or, in the same way, the rows of a data.frame).

get.nums <- gsub("^Team(\\d+)$", replacement = "\\1", x = xy)
xy[order(as.numeric(get.nums))]

[1] "Team1"  "Team2"  "Team3"  "Team4"  "Team5"  "Team6"  "Team7"  "Team8"  "Team9"  "Team10"
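
The same idea carries over to a data frame whose column mixes several prefixes. The following is only a sketch under assumptions: the data frame `df`, its columns `team` and `score`, and the "Group" labels are made up for illustration, and every label is assumed to end in a number.

```r
# Hypothetical data; the column names `team` and `score` are illustrative only
df <- data.frame(
  team  = c("Team10", "Team2", "Group3", "Team1", "Group12"),
  score = c(5, 3, 8, 1, 7),
  stringsAsFactors = FALSE
)

# Split each label into its text prefix and its trailing number
prefix <- sub("[0-9]+$", "", df$team)                 # "Team", "Group", ...
num    <- as.numeric(sub("^[^0-9]*", "", df$team))    # 10, 2, 3, ...

# Order rows by prefix first, then numerically within each prefix
df_sorted <- df[order(prefix, num), ]
df_sorted$team
```

Because `order()` returns row positions, indexing the data frame with it keeps the other columns aligned with the sorted labels. Labels without a trailing number would become `NA` after `as.numeric()` and be placed last within their prefix.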
Roman Luštrik