My files are of the format:

ada1
ada2
ada3
....
ada10
ada11
ada12

Unfortunately, when I write them out, ada10, ada11 and ada12 come before ada2. Could you help me sort them so that the numbers are in the order they should be?

Edit

Now, I have thousands of these files. Basically, xyz1-12, abc1-12 etc.

I use the following to get all files:

GG <- grep("*.txt", list.files(), value = TRUE)

So I can't put 'ada' manually.

Geekuna Matata

4 Answers

If the prefix is always three characters, you can sort by those characters first, followed by a numeric sort of the rest of the string:

GG <- paste0(c('ada', 'xyz'), 1:20) # Synthesis of data similar to what your command would give

Calling order with multiple arguments returns the permutation that sorts by the first key and breaks ties with the second; indexing GG by that permutation returns the names in the desired order:

GG[order(substring(GG, 1, 3), as.numeric(substring(GG, 4)))]
 [1] "ada1"  "ada3"  "ada5"  "ada7"  "ada9"  "ada11" "ada13" "ada15" "ada17" "ada19" "xyz2"  "xyz4"  "xyz6"  "xyz8"  "xyz10"
[16] "xyz12" "xyz14" "xyz16" "xyz18" "xyz20"
Matthew Lundberg

Another way using package gtools:

require(gtools)
x <- paste0('a', 1:12)
mixedsort(x)
[1] "a1"  "a2"  "a3"  "a4"  "a5"  "a6"  "a7"  "a8"  "a9"  "a10" "a11" "a12"
James King

If you can't change their names to something better (that is, ada001, ada002, ...) then you could create a double index. I am assuming fnames is a vector with the file names, and that the numbers are only preceded by a fixed number of letters.

alpha <- substr(fnames, 1, 3)
num <- as.integer(substr(fnames, 4, nchar(fnames)))

o <- order(alpha, num)   ## that's your sorting vector

You can modify this procedure to accommodate a varying number of letters using regular expressions to find the split.
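
As a rough sketch of that regular-expression variant: the file names below are made up for illustration, and the code assumes every name ends in .txt, that the prefix itself contains no digits, and that there is exactly one run of digits right before the extension. The variable names stem and sorted are not from the original answer.

```r
# Hypothetical file names with prefixes of different lengths (assumption)
fnames <- c("ada10.txt", "ada2.txt", "xy1.txt", "ada1.txt", "xy12.txt", "xy3.txt")

# Strip the extension, then split each name into letters and trailing digits
stem  <- sub("\\.txt$", "", fnames)
alpha <- sub("[0-9]+$", "", stem)                 # part before the trailing digits
num   <- as.integer(sub("^[^0-9]*", "", stem))    # the trailing digits as integers

# Sort by prefix first, then numerically within each prefix
o <- order(alpha, num)
sorted <- fnames[o]
print(sorted)
```

If the names do not all follow this pattern, num will contain NA for the ones that don't, so it is worth checking anyNA(num) before relying on the order.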

ilir

If you can change the file names you could do something like the following:

names0 <- paste0("a", 1:20)
temp <- strsplit(names0, "a")
ind <- sapply(temp, "[[", 2)
names1 <- paste0("a", sprintf("%03d", as.numeric(ind)))

> names1
[1] "a001" "a002" "a003" "a004" "a005" "a006"
[7] "a007" "a008" "a009" "a010" "a011" "a012"
[13] "a013" "a014" "a015" "a016" "a017" "a018"
[19] "a019" "a020"

You may have to tweak the call to sprintf, based on this answer.

Just to clarify, using file.rename, it would be pretty easy to rename all your files.
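
For what it's worth, here is a minimal sketch of what that renaming could look like. It runs in a scratch directory with made-up file names so nothing real is touched; the .txt extension, the three-digit padding and the names d, old and new are assumptions for the example, not something taken from the question.

```r
# Scratch directory with a few hypothetical files (assumption: names look like ada1.txt)
d <- tempfile("rename_demo")
dir.create(d)
old <- c("ada1.txt", "ada2.txt", "ada10.txt")
invisible(file.create(file.path(d, old)))

# Split each name into prefix and number, then rebuild it with zero padding
stem   <- sub("\\.txt$", "", old)
prefix <- sub("[0-9]+$", "", stem)
num    <- as.integer(sub("^[^0-9]*", "", stem))
new    <- sprintf("%s%03d.txt", prefix, num)

# file.rename is vectorised and returns TRUE for each successful rename
ok <- file.rename(file.path(d, old), file.path(d, new))
print(sort(list.files(d)))
```

On real data it would be prudent to compare old and new side by side first, and to confirm that new contains no duplicates, before calling file.rename.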

dayne