62

I am looping over all the files in a directory with the following command:

for i in *.fas; do some_code; done;

However, I get them in this order

vvchr1.fas  
vvchr10.fas  
vvchr11.fas
vvchr2.fas
...

instead of

vvchr1.fas
vvchr2.fas
vvchr3.fas
...

which is natural order.

I have tried the sort command, but to no avail.

oguz ismail
Perlnika

7 Answers

118
readarray -d '' entries < <(printf '%s\0' *.fas | sort -zV)
for entry in "${entries[@]}"; do
  printf '%s\n' "$entry"  # replace with the real work on $entry
done

Here printf '%s\0' *.fas yields a NUL-separated list of the directory entries with the extension .fas, and sort -zV sorts them in natural order.

Note that this needs GNU sort (for -z and -V) and bash 4.4 or later (for readarray -d).
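For a bash older than 4.4, where readarray has no -d option, the same NUL-separated list can be read with a plain while loop. This is only a sketch: it still assumes GNU sort, and the temporary directory and file names are made up for the demonstration.

```shell
#!/usr/bin/env bash
# Sketch: natural-order loop without readarray -d.
# Assumes GNU sort (for -z and -V); the demo directory and files are invented.
demo_dir=$(mktemp -d)
touch "$demo_dir"/vvchr{1,2,3,10,11}.fas

entries=()
# read -d '' splits the input on NUL bytes, so unusual file names survive intact
while IFS= read -r -d '' entry; do
  entries+=("$entry")
done < <(cd "$demo_dir" && printf '%s\0' *.fas | sort -zV)

for entry in "${entries[@]}"; do
  printf '%s\n' "$entry"
done

rm -rf "$demo_dir"
```

The loop prints vvchr1.fas, vvchr2.fas, vvchr3.fas, vvchr10.fas and vvchr11.fas, in that order.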

oguz ismail
catalin.costache
10

With the -g option, sort compares according to general numerical value:

 for FILE in `ls ./raw/ | sort -g`; do echo "$FILE"; done

0.log 1.log 2.log ... 10.log 11.log

This only works if the file names are numerical. If they are strings, you get them in alphabetical order. E.g.:

 for FILE in `ls ./raw/* | sort -g`; do echo "$FILE"; done

raw/0.log raw/10.log raw/11.log ... raw/2.log

gtangil
4

You will get the files in ASCII order, which means that vvchr10* comes before vvchr2*. I realise that you cannot rename your files (my bioinformatician brain tells me they contain chromosome data, and we simply don't call chromosome 1 "chr01"), so here's another solution (not using sort -V, which I can't find on any operating system I'm using):

ls *.fas | sed 's/^\([^0-9]*\)\([0-9]*\)/\1 \2/' | sort -k2,2n | tr -d ' ' |
while IFS= read -r filename; do
  printf '%s\n' "$filename"  # do work with $filename here
done

This is a bit convoluted and will not work with filenames containing spaces.
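If all names share a fixed-length prefix, as vvchr does here, the sed and tr steps may not be needed: sort can start its numeric key at a character offset. The following is a sketch under that assumption. The helper name natural_sort_vvchr is hypothetical, and names containing newlines are still not handled.

```shell
#!/usr/bin/env bash
# Sketch: numeric sort on the digits that follow a fixed-length prefix.
# Assumes every name starts with the 5-character prefix "vvchr".
natural_sort_vvchr() {
  # -k1.6n: the key starts at character 6 of field 1 and is compared numerically
  sort -k1.6n
}

# In real use, feed it with: printf '%s\n' *.fas
sorted=$(printf '%s\n' vvchr1.fas vvchr10.fas vvchr11.fas vvchr2.fas vvchr3.fas |
  natural_sort_vvchr)
printf '%s\n' "$sorted"
```

The key syntax -k1.6n is part of standard sort, so this does not depend on the GNU-only -V option.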

Another solution: Suppose we'd like to iterate over the files in size-order instead, which might be more appropriate for some bioinformatics tasks:

du *.fas | sort -k1,1n |
while read -r filesize filename; do
  printf '%s\n' "$filename"  # do work with $filename here
done

du prints the size in the first column, hence the sort key -k1,1n. To reverse the sorting, just add r after it (to get -k1,1nr).

Kusalananda
2

You mean that files with the number 10 come before files with the number 3 in your list? That's because ls sorts its results very simply, character by character, so something-10.whatever sorts before something-3.whatever.

One solution is to rename all files so that their numbers have the same number of digits (single-digit numbers get a leading 0).
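A minimal sketch of such a rename, written as a dry run that only prints the mv commands. The helper padded_name is hypothetical, and it assumes names of the form prefix, digits, dot, extension, such as vvchr1.fas.

```shell
#!/usr/bin/env bash
# Sketch: compute a zero-padded name such as vvchr01.fas from vvchr1.fas.
# padded_name is a hypothetical helper; it assumes <prefix><digits>.<ext> names.
padded_name() {
  local name=$1 width=${2:-2}
  if [[ $name =~ ^([^0-9]*)([0-9]+)(\..*)$ ]]; then
    # 10# forces base 10 so that digits like "08" are not read as octal
    printf '%s%0*d%s\n' "${BASH_REMATCH[1]}" "$width" \
      "$((10#${BASH_REMATCH[2]}))" "${BASH_REMATCH[3]}"
  else
    printf '%s\n' "$name"   # pattern not matched: leave the name unchanged
  fi
}

# Dry run: print the mv commands instead of executing them.
for f in vvchr1.fas vvchr2.fas vvchr10.fas; do
  new=$(padded_name "$f")
  [[ $f == "$new" ]] || echo mv -- "$f" "$new"
done
```

Removing the echo would perform the renames. After that, a plain for i in *.fas loop visits the files in numeric order.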

Some programmer dude
  • Yes. I see, thanks. However I did not name files, just downloaded them:) – Perlnika Nov 03 '11 at 09:50
  • 1
    Try `sort -n`. That sorts contiguous blocks of numbers in numerical order. While "10a" still precedes "1a", at least "1-a" precedes "10-a". – Ben Jun 24 '14 at 13:49
2
while IFS= read -r file ; do
    ls -l "$file" # or whatever
done < <(find . -name '*.fas' 2>/dev/null | sed -r -e 's/([0-9]+)/ \1/' | sort -k 2 -n | sed -e 's/ //;')

This solves the problem, presuming the file naming stays consistent. It doesn't rely on very recent versions of GNU sort or on reading the output of ls, and it doesn't fall victim to the pipe-to-while problems.

sorpigal
  • Thanks for the answer. You've forgotten a bracket in the end of <(find . -name '*.fas' 2>/dev/null | sed -r -e 's/([0-9]+)/ \1/' | sort -k 2 -n | sed -e 's/ //;') – Andrew Jan 30 '22 at 10:40
  • @Andrew: in over a decade you are the first one to notice. This has been fixed. – sorpigal Jan 31 '22 at 22:18
0

Use sort -rh and a while loop:

du -sh * | sort -rh | grep -P "avi$" |awk '{print $2}' | while read f; do fp=`pwd`/$f; echo $fp; done;
David Okwii
0

Like @Kusalananda's solution (perhaps easier to remember?) but catering for all files(?):

mapfile -t array < <(ls | sed 's/[^0-9]*\([0-9]*\)\..*/\1 &/' | sort -n | sed 's/^[^ ]* //')
for x in "${array[@]}";do echo "$x";done

In essence add a sort key, sort, remove sort key.
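The add-key, sort, remove-key idea can be sketched with a tab as the separator, so that names containing spaces survive. The function name sort_by_first_number is hypothetical. The sketch assumes bash 4 or later for mapfile, that names contain no tabs or newlines, and that the first run of digits in each name is the sort key.

```shell
#!/usr/bin/env bash
# Sketch of decorate / sort / undecorate.
# Assumes names contain no tabs or newlines; the key is the first run of digits.
sort_by_first_number() {
  # decorate: prefix each name with its first number (0 if none) and a tab
  awk '{ key = 0
         if (match($0, /[0-9]+/)) key = substr($0, RSTART, RLENGTH)
         print key "\t" $0 }' |
    sort -t $'\t' -k1,1n |   # sort numerically on the key column only
    cut -f2-                 # undecorate: drop the key column
}

mapfile -t files < <(printf '%s\n' 'chr 10.fas' 'chr 2.fas' 'chr 1.fas' |
  sort_by_first_number)
for f in "${files[@]}"; do
  printf '%s\n' "$f"
done
```

In real use the printf line would list *.fas instead of the three sample names.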

EDIT: moved comment to appropriate solution

potong
  • 1
    I'd rather stay away from non-standard flags. On Mac OSX, `ls -v` is "Force unedited printing of non-graphic characters; this is the default when output is not to a terminal.". On OpenBSD, it does not exist. – Kusalananda Nov 03 '11 at 11:22
  • I didn't say you did. I was just commenting on your mentioning of `ls -v`. – Kusalananda Nov 03 '11 at 13:04