I'm reading in some files from a directory using glob.glob. The files are named like this: 1.bmp

The names continue in this pattern: 1.bmp, 2.bmp, 3.bmp, and so on.

This is the code I currently have. It does technically sort, but not in the order I expected:

files = sorted(glob.glob('../../Documents/ImageAnalysis.nosync/sliceImage/*.bmp'))

This method sorts as such:

../../Documents/ImageAnalysis.nosync/sliceImage/84.bmp
../../Documents/ImageAnalysis.nosync/sliceImage/85.bmp
../../Documents/ImageAnalysis.nosync/sliceImage/86.bmp
../../Documents/ImageAnalysis.nosync/sliceImage/87.bmp
../../Documents/ImageAnalysis.nosync/sliceImage/88.bmp
../../Documents/ImageAnalysis.nosync/sliceImage/89.bmp

../../Documents/ImageAnalysis.nosync/sliceImage/9.bmp

../../Documents/ImageAnalysis.nosync/sliceImage/90.bmp
../../Documents/ImageAnalysis.nosync/sliceImage/91.bmp
../../Documents/ImageAnalysis.nosync/sliceImage/92.bmp
../../Documents/ImageAnalysis.nosync/sliceImage/93.bmp
../../Documents/ImageAnalysis.nosync/sliceImage/94.bmp
../../Documents/ImageAnalysis.nosync/sliceImage/95.bmp
../../Documents/ImageAnalysis.nosync/sliceImage/96.bmp
../../Documents/ImageAnalysis.nosync/sliceImage/97.bmp
../../Documents/ImageAnalysis.nosync/sliceImage/98.bmp
../../Documents/ImageAnalysis.nosync/sliceImage/99.bmp

In the output above I have highlighted the problem. The sort handles 90.bmp through 99.bmp fine, but 9.bmp appears between 89.bmp and 90.bmp. It obviously shouldn't be there; it should be near the start.

The output I'm expecting looks like this:

1.bmp
2.bmp
3.bmp
4.bmp
5.bmp
6.bmp
...
10.bmp
11.bmp
12.bmp
13.bmp
...

and so on until the end of the files

Is this possible to do with glob?

Neil Houston
  • You could ask it to sort based on the number, like `sorted(items, key=lambda x: int(re.findall(r'\d+', x)[0]))`? Untested! – han solo Mar 26 '19 at 12:35
  • Possible duplicate of [Does Python have a built in function for string natural sort?](https://stackoverflow.com/questions/4836710/does-python-have-a-built-in-function-for-string-natural-sort) – SuperShoot Mar 26 '19 at 12:35
  • @hansolo thanks! that did the trick! – Neil Houston Mar 26 '19 at 12:43

2 Answers

That is because the files are sorted by their names, which are strings, and strings are sorted in lexicographic order. Check [Python.Docs]: Sorting HOW TO for more sorting-related details.
For things to work as you'd expect, the "faulty" file 9.bmp would have to be named 09.bmp (the same applies to every single-digit name). With more than 100 files this becomes even clearer: the desired names would be 009.bmp, 035.bmp, and so on.
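To make the lexicographic behaviour concrete, here is a minimal sketch. The file names are made up for illustration (they stand in for whatever glob.glob returns), and the padding width of 3 is an assumption that only holds for fewer than 1000 files.

```python
# Hypothetical file names, standing in for real glob.glob() results.
names = ["1.bmp", "9.bmp", "10.bmp", "89.bmp", "90.bmp"]

# Plain string sort compares character by character, so "9.bmp"
# lands after "89.bmp" because "9" > "8".
plain = sorted(names)

# Zero-padding the stem to a fixed width (3 here, an assumption)
# makes string order agree with numeric order.
padded = sorted(name.split(".")[0].zfill(3) + ".bmp" for name in names)

print(plain)   # ['1.bmp', '10.bmp', '89.bmp', '9.bmp', '90.bmp']
print(padded)  # ['001.bmp', '009.bmp', '010.bmp', '089.bmp', '090.bmp']
```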

Alternatively (provided that all of the files follow the naming pattern), you can convert each file's base name, without its extension (check [Python.Docs]: os.path - Common pathname manipulations), to an int and sort on that by passing key to [Python.Docs]: sorted(iterable, *, key=None, reverse=False). This needs both glob and os to be imported:

files = sorted(glob.glob("../../Documents/ImageAnalysis.nosync/sliceImage/*.bmp"), key=lambda x: int(os.path.splitext(os.path.basename(x))[0]))
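The same idea can be tried without touching the file system. This is only a sketch: the paths are hypothetical stand-ins for the glob.glob output, and numeric_stem is a name made up here for the key function. Note that int() raises ValueError if any file name is not purely numeric.

```python
import os

# Hypothetical paths, standing in for the glob.glob() output.
paths = [
    "sliceImage/10.bmp",
    "sliceImage/9.bmp",
    "sliceImage/1.bmp",
    "sliceImage/89.bmp",
]

def numeric_stem(path):
    """Return the base name without extension as an int, e.g. 'a/9.bmp' -> 9."""
    stem, _ext = os.path.splitext(os.path.basename(path))
    return int(stem)  # raises ValueError for non-numeric names

ordered = sorted(paths, key=numeric_stem)
print(ordered)
```

Pulling the lambda out into a named function makes it easier to test on its own, and to add error handling later if some names turn out not to be numbers.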
CristiFati

Not with glob.glob alone. It returns the list in whatever order the underlying system provides, which may or may not be sorted.

What you need to do is provide a suitable key function to sorted, so that it uses the ordering you want instead of comparing the paths as plain text strings. Something like (untested code):

import os

def mysorter(x):
    path, fn = os.path.split(x)
    fn, ext = os.path.splitext(fn)
    if fn.isdigit():
        fn = f'{int(fn):08}'  # left pad with zeros so string order matches numeric order
    return f'{path}/{fn}{ext}'  # ext already includes the leading dot

Then

results = sorted(glob.glob(...), key=mysorter)
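If the names are not purely numeric, a more general key is sometimes used, usually called a natural sort. The sketch below is an illustration rather than part of the approach above: natural_key and the sample names are made up, and it assumes that splitting on runs of digits is the ordering you want.

```python
import re

def natural_key(text):
    """Split text into alternating non-digit and digit runs.

    re.split with a capturing group always returns the pieces in the order
    text, digits, text, digits, ... so odd positions hold the digit runs.
    """
    parts = re.split(r"(\d+)", text)
    return [int(p) if i % 2 else p.lower() for i, p in enumerate(parts)]

# Hypothetical names mixing text and numbers.
names = ["img10.bmp", "img9.bmp", "img1.bmp", "cover.bmp"]
ordered = sorted(names, key=natural_key)
print(ordered)  # ['cover.bmp', 'img1.bmp', 'img9.bmp', 'img10.bmp']
```

Because digit runs are compared as integers, no fixed padding width has to be chosen in advance.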
nigel222