I want to sort a vector of file names in R, just like how Windows sorts them when sorting by name (right-click --> sort by --> name).

Let's say I have four jpg files (I actually have more). When sorting by name, Windows puts them in the order written below:

283_20200110_230606.jpg 500_20191203_032950.jpg 10889_20200114_165958.jpg 314368230_20200116_140854.jpg

R, when using list.files(), sorts them like this: "10889_20200114_165958.jpg" "283_20200110_230606.jpg" "314368230_20200116_140854.jpg" "500_20191203_032950.jpg"

It seems R compares the names character by character, so names starting with 1 always come first, whereas Windows compares the numbers before the first underscore as whole numbers.

Is there a way to make them sorted in the same way? Either sort in R as Windows would, or sort in Windows as R does?

EDIT:

data for testing:

v1 <- c("10889_20200114_165958.jpg", "283_20200110_230606.jpg", "314368230_20200116_140854.jpg", "500_20191203_032950.jpg")

wyatt

3 Answers

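We can use mixedsort from the gtools package, which sorts strings containing embedded numbers in numeric rather than character order.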

gtools::mixedsort(v1)
#[1] "283_20200110_230606.jpg"       "500_20191203_032950.jpg"       "10889_20200114_165958.jpg"    
#[4] "314368230_20200116_140854.jpg"

data

v1 <- c("283_20200110_230606.jpg", "314368230_20200116_140854.jpg", 
"500_20191203_032950.jpg", "10889_20200114_165958.jpg")
akrun

The regex [0-9]{1,} matches the first run of one or more digits in each file name, which here is the number before the first _. Converting that to numeric and sorting on it gives the desired order.

library(tidyverse)

f <- c("283_20200110_230606.jpg", "314368230_20200116_140854.jpg", 
       "500_20191203_032950.jpg", "10889_20200114_165958.jpg")

tibble(f) %>% 
  mutate(prefix = as.numeric(str_extract(f, "[0-9]{1,}"))) %>% 
  arrange(prefix) %>% 
  pull(f)

[1] "283_20200110_230606.jpg"      
[2] "500_20191203_032950.jpg"      
[3] "10889_20200114_165958.jpg"    
[4] "314368230_20200116_140854.jpg"
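
The same idea also works without any packages. Below is a minimal base R sketch, assuming every file name starts with digits followed by an underscore; the helper name sort_by_numeric_prefix is made up for illustration. Names without a numeric prefix would become NA (with a warning) and be placed last by order().

```r
# Hypothetical helper: sort file names by the number before the first "_".
sort_by_numeric_prefix <- function(x) {
  # Drop everything from the first underscore onward, then convert to numeric.
  prefix <- as.numeric(sub("_.*$", "", x))
  # order() gives the permutation that sorts by prefix;
  # equal prefixes are ordered by the full file name.
  x[order(prefix, x)]
}

f <- c("283_20200110_230606.jpg", "314368230_20200116_140854.jpg",
       "500_20191203_032950.jpg", "10889_20200114_165958.jpg")

sorted <- sort_by_numeric_prefix(f)
print(sorted)
```

Passing the name itself as a second key to order() only matters when two files share the same numeric prefix.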
Rich Pauloo

If you have Rtools installed then:

shell("dir/b | C:\\Rtools\\bin\\sort.exe -g", intern = TRUE)

or if you have WSL installed:

shell('wsl ls -1 | sort -g', intern = TRUE)

Note that if the natural order happens to match the order of the files' timestamps, you could just sort by date (dir /od uses the last-modified time by default; add /tc to sort by creation time instead):

shell("dir/b /od", intern = TRUE)
G. Grothendieck