
Through os.listdir() I have created a list of hundreds of files picked up from a folder. All the filenames have the following pattern:

obj1__5
obj1__10
obj1__15
...
...
obj1__250
...
obj2__5
obj2__10
...
obj2__250
... and so on up to obj99

The files in the folder were ordered following this scheme. However, os.listdir() returned a list ordered this way:

obj1__0.png
obj1__10.png
obj1__100.png
obj1__105.png
...
obj1__145.png
obj1__15.png
obj1__150.png
obj1__155.png
...
obj1__190.png
obj1__195.png
obj1__20.png
obj1__200.png
obj1__205.png
... and so on

Is there any way to pick up the files in the same order they are displayed in the folder? Or is there a sorting function I can use to put them back in their proper order? Thanks

  • This isn't unique to sorting files - you just have a list of strings. You can write a `key` function to define what you want to sort by. – jonrsharpe Nov 02 '16 at 22:09
  • You want to do what's called natural sorting; you can read more about it [here](https://stackoverflow.com/questions/4836710/does-python-have-a-built-in-function-for-string-natural-sort) – Francisco Nov 02 '16 at 22:13
  • @jonrsharpe You are right. I just changed the title of the question, because it could potentially apply to all strings. – Gianluca John Massimiani Nov 02 '16 at 22:18
  • Then what about picking them up in the same order they are displayed by the OS? Is there any way to do that? – Gianluca John Massimiani Nov 02 '16 at 22:20
  • "The way they are displayed by the OS" can vary a lot depending on what OS, what version, and even what part. Explorer? The command line? I believe on most modern file systems, filenames are stored as a tree of some kind and come out in a sorted order when traversed (usually a lexicographic sort) but there's no way to know that generically. – kindall Nov 02 '16 at 22:43

3 Answers


A general-purpose natural sorting function is something like this:

import os
import re

def naturalsort(name, digits=re.compile("([0-9]+)")):
    return [int(x) if x.isdigit() else x for x in digits.split(name)]

You get back a list that contains integer values of the runs of digits and string versions of the rest. You can use this as the key when sorting:

sorted(os.listdir(), key=naturalsort)

You might think that this would cause problems in Python 3 when you try to compare e.g. "abc.txt" with "123.txt", since comparing a str with an int is an error in Python 3. It still works: because we're splitting on runs of digits, the first element of the key is '' for strings that start with a run of digits, which puts numbered items before any alphabetic ones, as they should be. Put another way, the first element of the key is always a string (which might be empty), the second is always an integer, and so on, alternating to the end of the string. Therefore Python never tries to compare different types.
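To make the alternating str/int structure concrete, here is a small sketch that prints the keys for a few names and sorts a sample list. The names in `names` are made up for illustration and stand in for whatever os.listdir() returns.

```python
import re

def naturalsort(name, digits=re.compile("([0-9]+)")):
    # The capturing group makes split() keep the digit runs in the result.
    return [int(x) if x.isdigit() else x for x in digits.split(name)]

# Keys alternate str, int, str, ... starting with a (possibly empty) str.
print(naturalsort("obj1__105.png"))  # ['obj', 1, '__', 105, '.png']
print(naturalsort("123.txt"))        # ['', 123, '.txt']
print(naturalsort("abc.txt"))        # ['abc.txt']

# Hypothetical sample names, not taken from a real folder.
names = ["obj1__100.png", "obj1__15.png", "abc.txt",
         "123.txt", "obj2__5.png", "obj1__0.png"]
ordered = sorted(names, key=naturalsort)
print(ordered)
```

The name starting with digits sorts first because its key begins with the empty string, and obj1__15.png lands before obj1__100.png because 15 and 100 are compared as integers.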

kindall

This should work for you.

import os
import re

def splitter(name):
    # Extract both numbers as integers, e.g. "obj12__105.png" -> (12, 105).
    reg = re.search(r"(\d+)__(\d+)", name)
    return (int(reg.group(1)), int(reg.group(2)))

# Sort on the (object number, second number) tuple returned by splitter.
sortedFiles = sorted(os.listdir(), key=splitter)

The key passed to sorted() is a tuple, so the names are ordered by the first number, and ties are broken by the second number.
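As a quick check that does not depend on the contents of a folder, the tuple-key idea can be sketched on a hard-coded list. The names below are hypothetical, and the sketch assumes every name contains the `<number>__<number>` pattern (re.search returns None otherwise).

```python
import re

def splitter(name):
    # Pull out the object number and the number after "__" as integers.
    reg = re.search(r"(\d+)__(\d+)", name)
    return (int(reg.group(1)), int(reg.group(2)))

# Hypothetical sample names standing in for os.listdir() output.
names = ["obj2__5.png", "obj1__100.png", "obj10__0.png", "obj1__15.png"]

# Tuples compare element by element: object number first, then the second number.
ordered = sorted(names, key=splitter)
print(ordered)
# ['obj1__15.png', 'obj1__100.png', 'obj2__5.png', 'obj10__0.png']
```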

zenofsahil

You can try this:

>>> l = ['obj1__0.png', 'obj1__10.png', 'obj3__15.png', 'obj1__15.png', 'obj2__15.png', 'obj1__100.png']
>>>
>>> sorted(l, key=lambda x: (int(x.split('__')[0][3:]), int(x.split('__')[1].split('.')[0])))
['obj1__0.png', 'obj1__10.png', 'obj1__15.png', 'obj1__100.png', 'obj2__15.png', 'obj3__15.png']
coder