I need to get the list of ".csv" files in a directory, sorted by creation date.
I use this function:
from os import listdir
from os.path import isfile, join, getctime
def get_sort_files(path, file_extension):
    list_of_files = filter(lambda x: isfile(join(path, x)), listdir(path))
    list_of_files = sorted(list_of_files, key=lambda x: getctime(join(path, x)))
    list_of_files = [file for file in list_of_files if file.endswith(file_extension)] # keep only csv files
    return list_of_files
It works fine when I use it in directories that contain a small number of csv files (e.g. 500), but it's very slow when I use it in directories that contain 50000 csv files: it takes about 50 seconds to return.
How can I modify it? Or can I use a better alternative function?
EDIT1:
The bottleneck is the sorted function, so I need an alternative way to sort the files by creation date without using it.
EDIT2:
I only need the oldest file (the first if sorted by creation date), so maybe I don't need to sort all the files. Can I just pick the oldest one?
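A minimal sketch of the "just pick the oldest one" idea, assuming Python 3.6+ (needed to use os.scandir as a context manager). The name get_oldest_file is hypothetical and not part of my current code. It makes a single pass over the directory, checks the extension before touching the file system, and keeps only the smallest creation time seen so far, so nothing is sorted:

```python
import os

def get_oldest_file(path, file_extension):
    """Return the name of the oldest matching file in path, or None if there is none."""
    oldest_name = None
    oldest_ctime = None
    with os.scandir(path) as entries:
        for entry in entries:
            # Check the extension first so non-matching entries are never stat-ed.
            if not entry.name.endswith(file_extension):
                continue
            # Skip directories and other non-regular entries.
            if not entry.is_file():
                continue
            # st_ctime is the same value that os.path.getctime() returns.
            ctime = entry.stat().st_ctime
            if oldest_ctime is None or ctime < oldest_ctime:
                oldest_name, oldest_ctime = entry.name, ctime
    return oldest_name
```

It would be called as get_oldest_file(path, ".csv"). Like getctime, st_ctime is the creation time on Windows but the time of the last metadata change on most Unix systems, so the result carries the same caveat as the original function.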