13

Is there a simple way to sort files in a directory in Python? The files I have in mind are listed in this order:

file_01_001
file_01_005
...
file_02_002
file_02_006
...
file_03_003
file_03_007
...
file_04_004
file_04_008

What I want is something like

file_01_001
file_02_002
file_03_003
file_04_004
file_01_005
file_02_006
...

I am currently opening them using glob for the directory as follows:

for filename in glob(path):
    with open(filename, 'rb') as thefile:
        pass  # Do stuff to each file

So, while the program performs the desired tasks, it's giving incorrect data if I do more than one file at a time, due to the ordering of the files. Any ideas?

Lou
  • Files don't have an order in which they are placed. They are sorted by file explorer of your choice by certain value, like name, file size, date added, etc. Thus, you cannot "do something" to files and make them sorted in your directory. – Božo Stojković Jun 13 '16 at 18:23
  • Please explain your custom order: you want files `file_0x_00x` (in order of increasing x) first, then `file_0x_00y` where y != x, in order of increasing x then increasing y? – smci Dec 21 '19 at 15:06

2 Answers

36

As mentioned, files in a directory are not inherently sorted in a particular way. Thus, we usually 1) grab the file names 2) sort the file names by desired property 3) process the files in the sorted order.

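You can get the file names in the directory as follows. Suppose the directory is "~/home". Note that os.listdir does not expand "~" on its own, so expand it first:

import os

directory = os.path.expanduser("~/home")
file_list = os.listdir(directory)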

To sort file names:

# Sort key: the last 4 characters of the file name, e.g. "_001"
def last_4chars(x):
    return x[-4:]

sorted(file_list, key=last_4chars)

So it looks as follows:

In [4]: sorted(file_list, key = last_4chars)
Out[4]:
['file_01_001',
 'file_02_002',
 'file_03_003',
 'file_04_004',
 'file_01_005',
 'file_02_006',
 'file_03_007',
 'file_04_008']
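Slicing the last four characters works here because the trailing counter is zero-padded to a fixed width and the names have no extension. If that may not hold, a key that parses the trailing digits as an integer is a bit more tolerant. This is only a sketch: the helper name trailing_number and the choice to sort names without a trailing number first are assumptions made for illustration.

```python
import os
import re

def trailing_number(name):
    """Return the run of digits at the end of the file name's stem as an int."""
    stem, _ext = os.path.splitext(name)   # drop an extension such as ".dat", if any
    match = re.search(r"(\d+)$", stem)
    # Assumption for this sketch: names without a trailing number sort first.
    return int(match.group(1)) if match else -1

names = ["file_01_001", "file_01_005", "file_02_002", "file_02_006"]
ordered = sorted(names, key=trailing_number)
print(ordered)
```

Because the key is an integer, "file_01_10" would sort after "file_01_9", which a plain string slice would get wrong.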

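To read in and process them in sorted order, join each name with the directory, since os.listdir returns bare file names:

directory = os.path.expanduser("~/home")
file_list = os.listdir(directory)

for filename in sorted(file_list, key=last_4chars):
    with open(os.path.join(directory, filename), 'rb') as thefile:
        pass  # Do stuff to each file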
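
Since the question already uses glob, the same key can be applied to its results. glob returns paths that include the directory part, so the key below looks only at the base name. This is a rough sketch: the function name sorted_by_suffix and the "data/file_*" pattern are placeholders, not anything from the question.

```python
import glob
import os

def sorted_by_suffix(pattern):
    """Return the paths matching pattern, ordered by the last 4 characters of each base name."""
    paths = glob.glob(pattern)
    return sorted(paths, key=lambda p: os.path.basename(p)[-4:])

if __name__ == "__main__":
    # "data/file_*" is a placeholder pattern; substitute the real directory.
    for path in sorted_by_suffix(os.path.join("data", "file_*")):
        with open(path, "rb") as thefile:
            chunk = thefile.read()  # stand-in for the real per-file work
```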
Gene Burinsky
11

A much better solution is to use Tcl's "lsort -dictionary":

from tkinter import Tcl
Tcl().call('lsort', '-dict', file_list)

Tcl's dictionary sorting compares embedded numbers by value instead of character by character, so you get results similar to the ones a file manager uses for sorting files.
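
If Tk is not available, the same kind of ordering can be approximated in plain Python. The sketch below is not Tcl's algorithm, just a common approach: split each name into text and digit chunks and compare the digit chunks as integers. The name natural_key is made up for this example.

```python
import re

def natural_key(name):
    """Split name into alternating text and digit chunks; digit chunks compare as ints."""
    parts = re.split(r"(\d+)", name)
    # With a capturing group, re.split puts text at even indices and digits at odd ones,
    # so keys always compare str with str and int with int.
    return [int(p) if i % 2 else p.lower() for i, p in enumerate(parts)]

names = ["file_10", "file_2", "file_1"]
natural = sorted(names, key=natural_key)
print(natural)
```

Note that natural sorting orders by the whole name, so for the names in the question it keeps file_01_005 ahead of file_02_002. To order by the trailing counter alone, a key on that counter is still needed.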

  • This sorting principle is called "Natural Sorting". It can also easily be implemented by oneself in case you do not want to use this dependency. See e.g. [here](https://stackoverflow.com/questions/5967500/how-to-correctly-sort-a-string-with-a-number-inside). – jeanggi90 May 09 '21 at 19:23