Description of the script problem

Hello, I am new to Python and I have a problem sorting my files. I have multiple text files numbered from 0 to 20, but when I sort them they come out in the order 0, 1, 10, 11 ... instead of 0, 1, 2, 3 ... I tried several things I found here, but none of them worked. Could you help me, please?

import os

data_dir = 'Data_CO2/'
folder = '01-15-2020-B/'
dir_folder = data_dir + folder
files = os.listdir(dir_folder)

# Keep only the TPD measurement files
files_20 = []
for ff in files:
    if 'TPD' in ff:
        files_20.append(ff)

files_20.sort()  # plain string sort, so '10' comes before '2'
files_20

Output:
'TPD at 1K.min from 26K to 120K-CsI 335deg, 1cm-1, 64 scans -CO2 CH4 mix -01-15-2020-0.DPT',
 'TPD at 1K.min from 26K to 120K-CsI 335deg, 1cm-1, 64 scans -CO2 CH4 mix -01-15-2020-1.DPT',
 'TPD at 1K.min from 26K to 120K-CsI 335deg, 1cm-1, 64 scans -CO2 CH4 mix -01-15-2020-10.DPT',
 'TPD at 1K.min from 26K to 120K-CsI 335deg, 1cm-1, 64 scans -CO2 CH4 mix -01-15-2020-11.DPT',
 'TPD at 1K.min from 26K to 120K-CsI 335deg, 1cm-1, 64 scans -CO2 CH4 mix -01-15-2020-12.DPT',
 'TPD at 1K.min from 26K to 120K-CsI 335deg, 1cm-1, 64 scans -CO2 CH4 mix -01-15-2020-13.DPT',
 'TPD at 1K.min from 26K to 120K-CsI 335deg, 1cm-1, 64 scans -CO2 CH4 mix -01-15-2020-14.DPT',
 'TPD at 1K.min from 26K to 120K-CsI 335deg, 1cm-1, 64 scans -CO2 CH4 mix -01-15-2020-15.DPT',
 'TPD at 1K.min from 26K to 120K-CsI 335deg, 1cm-1, 64 scans -CO2 CH4 mix -01-15-2020-16.DPT',
 'TPD at 1K.min from 26K to 120K-CsI 335deg, 1cm-1, 64 scans -CO2 CH4 mix -01-15-2020-17.DPT',
 'TPD at 1K.min from 26K to 120K-CsI 335deg, 1cm-1, 64 scans -CO2 CH4 mix -01-15-2020-18.DPT',
 'TPD at 1K.min from 26K to 120K-CsI 335deg, 1cm-1, 64 scans -CO2 CH4 mix -01-15-2020-19.DPT',
 'TPD at 1K.min from 26K to 120K-CsI 335deg, 1cm-1, 64 scans -CO2 CH4 mix -01-15-2020-2.DPT',
 'TPD at 1K.min from 26K to 120K-CsI 335deg, 1cm-1, 64 scans -CO2 CH4 mix -01-15-2020-20.DPT',
 'TPD at 1K.min from 26K to 120K-CsI 335deg, 1cm-1, 64 scans -CO2 CH4 mix -01-15-2020-3.DPT',
 'TPD at 1K.min from 26K to 120K-CsI 335deg, 1cm-1, 64 scans -CO2 CH4 mix -01-15-2020-4.DPT',
 'TPD at 1K.min from 26K to 120K-CsI 335deg, 1cm-1, 64 scans -CO2 CH4 mix -01-15-2020-5.DPT',
 'TPD at 1K.min from 26K to 120K-CsI 335deg, 1cm-1, 64 scans -CO2 CH4 mix -01-15-2020-6.DPT',
 'TPD at 1K.min from 26K to 120K-CsI 335deg, 1cm-1, 64 scans -CO2 CH4 mix -01-15-2020-7.DPT',
 'TPD at 1K.min from 26K to 120K-CsI 335deg, 1cm-1, 64 scans -CO2 CH4 mix -01-15-2020-8.DPT',
 'TPD at 1K.min from 26K to 120K-CsI 335deg, 1cm-1, 64 scans -CO2 CH4 mix -01-15-2020-9.DPT'
  • related https://stackoverflow.com/q/2669059/7207392 – Paul Panzer Jul 06 '20 at 08:08
  • Hi! Please include the code/error messages as text and not as an image. – user2314737 Jul 06 '20 at 08:08
  • It looks like the default sort for files is alphabetical order. I have run into similar problems with the Unix command `sort`. I'm not familiar with working with files in Python, but I would expect you can pass in a different comparison function when you call sort. – Riley Jul 06 '20 at 10:08

1 Answer

Because sort is being handed a list of strings, it compares them character by character (lexicographically), so "10" comes before "2". If you want to sort based on the numeric "index" instead, you can do:

def get_index(file_name: str):
    indexed_extension = file_name.split("-")[-1]
    index = indexed_extension.split(".")[0]

    return int(index)

followed by

files_20.sort(key=get_index)

This will work for any date (not just 01-15-2020). It relies on every file name ending in -<index>.<extension>; if a name has no numeric index in that position, int() will raise a ValueError.
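
To see the difference between the two orderings, here is a small self-contained sketch. The shortened file names and the helper name `trailing_index` are made up for illustration; it uses a regular expression instead of `split`, which is just one possible variant of the same idea and not the only way to do it.

```python
import re

# Hypothetical, shortened file names; the real ones carry a longer prefix.
names = [f"TPD mix -01-15-2020-{i}.DPT" for i in (0, 1, 10, 11, 2, 20, 3)]


def trailing_index(file_name: str) -> int:
    """Return the integer between the last '-' and the file extension."""
    match = re.search(r"-(\d+)\.[^.]+$", file_name)
    if match is None:
        raise ValueError(f"no trailing index in {file_name!r}")
    return int(match.group(1))


by_text = sorted(names)                       # string comparison
by_index = sorted(names, key=trailing_index)  # numeric comparison

print([trailing_index(n) for n in by_text])   # [0, 1, 10, 11, 2, 20, 3]
print([trailing_index(n) for n in by_index])  # [0, 1, 2, 3, 10, 11, 20]
```

Raising an explicit error for names that do not match is a design choice here; depending on the folder contents, filtering such files out beforehand may be more convenient.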

Mario Ishac