I have a folder containing files named:

file_1.txt, file_2.txt, file_3.txt, file_10.txt, file_100.txt

I am reading these files using os.walk, and I want to print the file names in sorted order. My code is as follows:

import os
import fnmatch
rootDir = "lecture1"
for root, dirs, files in os.walk(rootDir):
   files = sorted(files)
   for file in fnmatch.filter(files, '*.txt'):
        print(os.path.join(root, file))

But the above code does not print the files in the order I want. Please suggest a way to print them in the following order:

file_1.txt, file_2.txt, file_3.txt, file_10.txt, file_100.txt

Currently it prints:

file_1.txt, file_10.txt, file_100.txt, file_2.txt, file_3.txt
Bhargav Rao
  • This has been answered here: http://stackoverflow.com/questions/4836710/does-python-have-a-built-in-function-for-string-natural-sort . Basically, `sorted` does a plain alphanumeric (lexicographic) sort, whereas you want natural sorting, which takes more work. You will either need to use the `re` module and write a sort function yourself, or use the third-party `natsort` natural sorting library, as explained in that link. – dopstar Nov 29 '15 at 10:29
  • related: [Python analog of natsort function (sort a list using a “natural order” algorithm)](http://stackoverflow.com/q/2545532/4279) – jfs Nov 29 '15 at 10:32

2 Answers

The output isn't in the order you expect because of

files = sorted(files)

where files contains file_1.txt, file_100.txt, etc.

As in the example above, file_1.txt and file_100.txt are strings, and strings are compared character by character. So sorted decides that file_2.txt > file_100.txt because '2' > '1' (note the quotes: these are characters, not numbers).

To make this clearer:

>>> '2' > '100'
True
>>> 2 > 100
False
>>> int('2') > int('100')
False
>>> 
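The same effect can be reproduced on the file names from the question. This is only a small illustrative sketch; the variable names `names` and `lexicographic` are made up for the example:

```python
# File names from the question, as plain strings.
names = ["file_1.txt", "file_2.txt", "file_3.txt", "file_10.txt", "file_100.txt"]

# sorted() compares strings character by character, so "file_10.txt"
# lands before "file_2.txt": '1' < '2' at the first differing position.
lexicographic = sorted(names)
print(lexicographic)
# ['file_1.txt', 'file_10.txt', 'file_100.txt', 'file_2.txt', 'file_3.txt']
```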

So you need to use a regex to extract the number, convert it to an integer with the int() function, and use that as the sort key, as in the following code:

import os
import re
import fnmatch

rootDir = "lecture1"

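for root, dirs, files in os.walk(rootDir):
    # Filter first so the sort key only sees .txt names
    # (assumed to look like file_<number>.txt; other names would raise an error).
    files = fnmatch.filter(files, '*.txt')
    # Sort by the integer in the name instead of by the raw string.
    files.sort(key=lambda x: int(re.search(r'file_(\d+)\.txt', x).group(1)))
    for file in files:
        print(os.path.join(root, file))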
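
If the names do not all follow the file_<number>.txt pattern, a more general key can help. The following is a minimal sketch of one common approach, not something from the original post: it splits each name into text and digit runs with re.split and compares the digit runs as integers. The helper name `natural_key` is hypothetical, and the lower-casing of text chunks is an assumption about the desired ordering.

```python
import re

def natural_key(name):
    """Build a sort key in which runs of digits compare as integers."""
    # re.split with a capturing group keeps the digit runs in the result,
    # so chunks alternate: text, digits, text, digits, ...
    chunks = re.split(r'(\d+)', name)
    # Odd positions hold the digit runs; even positions hold the text between them.
    return [int(chunk) if i % 2 else chunk.lower()
            for i, chunk in enumerate(chunks)]

names = ["file_100.txt", "file_2.txt", "file_10.txt", "file_1.txt", "file_3.txt"]
natural = sorted(names, key=natural_key)
print(natural)
# ['file_1.txt', 'file_2.txt', 'file_3.txt', 'file_10.txt', 'file_100.txt']
```

Inside the os.walk loop this could be used as files.sort(key=natural_key) after filtering.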
Remi Guan
file_1.txt, file_10.txt, file_100.txt, file_2.txt, file_3.txt

This is a lexicographic sort. You need to add a custom comparator that parses the number out of each filename and compares the numbers.
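
A comparator along these lines might look like the sketch below. It uses functools.cmp_to_key from the standard library to turn a comparison function into a sort key. The helpers `file_number` and `compare_by_number` are hypothetical names, and treating a name without any digits as -1 (so it sorts first) is an assumption made for the example.

```python
import re
from functools import cmp_to_key

def file_number(name):
    """Return the first run of digits in the name as an int, or -1 if none."""
    match = re.search(r'(\d+)', name)
    return int(match.group(1)) if match else -1

def compare_by_number(a, b):
    """Comparator: negative if a sorts before b, zero if equal, positive otherwise."""
    na, nb = file_number(a), file_number(b)
    return (na > nb) - (na < nb)

names = ["file_1.txt", "file_10.txt", "file_100.txt", "file_2.txt", "file_3.txt"]
ordered = sorted(names, key=cmp_to_key(compare_by_number))
print(ordered)
# ['file_1.txt', 'file_2.txt', 'file_3.txt', 'file_10.txt', 'file_100.txt']
```

In practice a plain key function such as key=file_number gives the same order with less code; the comparator form is shown only because it matches the wording above.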

kain64b