My question is closely related to "Python identify file with largest number as part of filename".

I want to append files to a certain directory. The files are named file1, file2, ..., fileN. This works if I do it in one go, but when I want to add files again and need to find the last file added (in this case the file with the highest number), it treats 'file6' as higher than 'file100'.

How can I solve this?

import glob
import os

latest_file = max(sorted(list_of_files, key=os.path.getctime))
print(latest_file)

As you can see, I tried looking at the creation time, and I also tried the modification time, but these can be identical for several files, so that doesn't help.

EDIT: my filenames have the extension ".txt" after the number.

Romano Vacca

2 Answers

I'll try to solve it only using filenames, not dates.

You have to convert the number to an integer before applying the criterion; otherwise an alphanumeric comparison is applied to the whole filename, which is why 'file6' ranks above 'file100'.

Proof of concept:

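import re
list_of_files = ["file1", "file100", "file4", "file7"]

def extract_number(f):
    # raw string so that \d reaches the regex engine unchanged
    s = re.findall(r"\d+$", f)
    # (number, name) tuple: compare numerically, fall back to the name on ties
    return (int(s[0]) if s else -1, f)

print(max(list_of_files, key=extract_number))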

result: file100

  • the key function extracts the digits found at the end of the filename and converts them to an integer; if nothing is found, it returns -1
  • you don't need to sort to find the max; just pass the key to max directly
  • if two files have the same index, the full filename is used to break the tie (which explains the tuple key)
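
Note that the question's edit says the real filenames end in ".txt", so a pattern anchored on trailing digits would not match them directly. Here is a minimal sketch of the same key-function idea for that case, assuming names of the form file<N>.txt. The sample list is hypothetical and stands in for whatever glob.glob() returns, and stripping the extension with os.path.splitext is one possible approach, not the only one:

```python
import os
import re

def extract_number(path):
    # Drop the directory and the extension, then look for trailing digits.
    stem = os.path.splitext(os.path.basename(path))[0]
    match = re.search(r"\d+$", stem)
    # Names without a trailing number sort below every numbered file.
    return (int(match.group()) if match else -1, path)

# Hypothetical sample names, standing in for the output of glob.glob().
list_of_files = ["file1.txt", "file100.txt", "file4.txt", "file7.txt", "notes.txt"]
latest_file = max(list_of_files, key=extract_number)
print(latest_file)
```

Because the key works on the stem only, it behaves the same for bare names and for full paths.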
Jean-François Fabre

Using the following regular expression you can get the number of each file:

import re

maxn = 0
for name in list_of_files:
    match = re.search(r'file(\d+)', name)  # assuming filename is "filexxx.txt"
    if match is None:
        continue  # skip names that do not follow the pattern
    num = int(match.group(1))
    # keep the larger of num and the previous max
    maxn = max(maxn, num)

At the end of the loop, maxn will be your highest filename number.
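
Since the goal in the question is to append further files, the highest number can then be turned into the next free name. This is a rough sketch, assuming the file<N>.txt naming from the question. The function name next_filename, its prefix and ext parameters, and the sample list are made up for illustration:

```python
import re

def next_filename(names, prefix="file", ext=".txt"):
    # Match only names that look exactly like <prefix><digits><ext>.
    pattern = re.compile(re.escape(prefix) + r"(\d+)" + re.escape(ext) + r"$")
    numbers = [int(m.group(1)) for m in map(pattern.match, names) if m]
    # Start at 1 when no numbered file exists yet.
    return "{}{}{}".format(prefix, max(numbers) + 1 if numbers else 1, ext)

print(next_filename(["file1.txt", "file6.txt", "file100.txt"]))
```

Names that do not follow the pattern are ignored, so unrelated files in the same directory do not affect the result.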

Jean-François Fabre
dirkgroten