209

What is the best way to get a list of all files in a directory, sorted by date [created | modified], using python, on a windows machine?

Liza

19 Answers

196

I've done this in the past for a Python script to determine the last updated files in a directory:

import glob
import os

search_dir = "/mydir/"
# keep only regular files; os.path.isfile() follows symlinks, so links to
# files stay in the list while directories and broken links are dropped
# (thanks to J.F. Sebastian for pointing out that the requirement was a list
# of files, presumably not including directories)
files = list(filter(os.path.isfile, glob.glob(search_dir + "*")))
files.sort(key=lambda x: os.path.getmtime(x))

That should do what you're looking for based on file mtime.

EDIT: Note that you can also use os.listdir() in place of glob.glob() if desired. I used glob in my original code because I only wanted files with a particular set of file extensions, which glob() is better suited to. With listdir it would look like this:

import os

search_dir = "/mydir/"
os.chdir(search_dir)
files = filter(os.path.isfile, os.listdir(search_dir))
files = [os.path.join(search_dir, f) for f in files] # add path to each file
files.sort(key=lambda x: os.path.getmtime(x))
Yash
Jay
  • glob() is nice, but keep in mind that it skips files starting with a period. *nix systems treat such files as hidden (thus omitting them from listings), but in Windows they are normal files. – efotinis Oct 03 '08 at 19:31
  • These solutions don't exclude dirs from list. – Constantin Oct 03 '08 at 21:00
  • Your os.listdir solution is missing the os.path.join: files.sort(lambda x,y: cmp(os.path.getmtime(os.path.join(search_dir,x)), os.path.getmtime(os.path.join(search_dir,y)))) – Peter Hoffmann Oct 04 '08 at 02:56
  • 2
    `files.sort(key=lambda fn: os.path.getmtime(os.path.join(search_dir, fn)))` – jfs Feb 11 '09 at 20:40
  • `files = filter(os.path.isfile, os.listdir(search_dir))` – jfs Feb 11 '09 at 20:44
  • Your solution doesn't sort by creation date as OP asks. See http://stackoverflow.com/questions/168409/how-do-you-get-a-directory-listing-sorted-by-creation-date-in-python/539024#539024 – jfs Feb 11 '09 at 22:05
  • @J.F. - the question actually asks "date [created | modified]" so mtime is a better choice than ctime. – Jay Feb 12 '09 at 15:43
  • @J.F. - thanks for pointing out the "key" param to sort, that was added in Python 2.4 and this code was originally on python 2.3 so I wasn't aware of it at the time. Learn something new every day! – Jay Feb 12 '09 at 16:27
  • 41
    A mere `files.sort(key=os.path.getmtime)` should work (without `lambda`). – jfs Dec 03 '09 at 19:01
  • Note: after `os.chdir(search_dir)`, you don't need `os.listdir(search_dir)`; you could use `os.listdir(os.curdir)` instead and therefore you don't need `os.path.join(search_dir, f)` either. You could replace the last 3 lines with this: `files = sorted(filter(os.path.isfile, os.listdir(os.curdir)), key=os.path.getmtime)` – jfs Jul 23 '15 at 15:20
  • In case of a large folder, and if one only wants the last file, there is no more efficient way of doing this, right? – FooBar Mar 01 '16 at 11:11
  • @FooBar to monitor a folder for new files, you could use the `watchdog` module. To find the file created last in the given directory only once, `max()` + `os.scandir()` or `os.listdir()` is enough. Here's [code example (text in Russian)](http://ru.stackoverflow.com/a/477013/23044) – jfs Oct 17 '16 at 23:44
  • How do I manipulate the time it gives me? For example, I want to look at the files that are older than one week? Is there a way to convert the output from os.path.getmtime(x) to a date? – M Waz Apr 09 '19 at 22:52
  • the os.chdir() is relevant although I have an absolute path. Welcome to Python! – Timo Nov 24 '20 at 17:21
  • you could do `files = [os.path.join(search_dir, f) for f in files if ".txt" in f]` to get only .txt files (example) - to get only files with specific extension – user20068036 Mar 06 '23 at 11:16
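
On the question above about files older than one week: os.path.getmtime() returns a float number of seconds since the epoch, so datetime.fromtimestamp() turns it into a date and a plain comparison against a cutoff does the filtering. A minimal sketch, where files_older_than and its days parameter are names made up for illustration:

```python
import os
import time
from datetime import datetime, timedelta


def files_older_than(search_dir, days=7):
    """Return (path, modified datetime) pairs older than `days`, oldest first."""
    cutoff = time.time() - timedelta(days=days).total_seconds()
    result = []
    for name in os.listdir(search_dir):
        path = os.path.join(search_dir, name)
        if not os.path.isfile(path):
            continue  # skip directories and broken symlinks
        mtime = os.path.getmtime(path)  # float seconds since the epoch
        if mtime < cutoff:
            result.append((path, datetime.fromtimestamp(mtime)))
    result.sort(key=lambda pair: pair[1])
    return result
```

Each returned datetime is in local time, so it can be printed with strftime() or compared with other datetime objects directly.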
181

Update: to sort dirpath's entries by modification date in Python 3:

import os
from pathlib import Path

paths = sorted(Path(dirpath).iterdir(), key=os.path.getmtime)

(put @Pygirl's answer here for greater visibility)

If you already have a list of filenames, files, then to sort it in place by creation time on Windows (make sure the list contains absolute paths):

files.sort(key=os.path.getctime)

The list of files you could get, for example, using glob as shown in @Jay's answer.
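
Since st_ctime only means "creation time" on Windows, here is a hedged sketch of a more portable key function. It assumes that st_birthtime is present on the stat result where the platform and Python version expose it (macOS and the BSDs, and Windows on newer Python releases) and falls back to st_ctime otherwise. The names creation_time and sort_by_creation are hypothetical:

```python
import os


def creation_time(path):
    """Best-effort creation timestamp for `path`, in seconds since the epoch."""
    st = os.stat(path)
    # Prefer st_birthtime when the platform provides it. Otherwise fall back
    # to st_ctime, which is the creation time on Windows but the time of the
    # last metadata change on Unix.
    return getattr(st, "st_birthtime", st.st_ctime)


def sort_by_creation(paths):
    """Return `paths` sorted oldest first by best-effort creation time."""
    return sorted(paths, key=creation_time)
```

On Linux this still sorts by metadata-change time, because the fallback is used there.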


Old answer: Here's a more verbose version of @Greg Hewgill's answer. It conforms most closely to the question's requirements: it distinguishes between creation and modification dates (at least on Windows).

#!/usr/bin/env python
from stat import S_ISREG, ST_CTIME, ST_MODE
import os, sys, time

# path to the directory (relative or absolute)
dirpath = sys.argv[1] if len(sys.argv) == 2 else r'.'

# get all entries in the directory w/ stats
entries = (os.path.join(dirpath, fn) for fn in os.listdir(dirpath))
entries = ((os.stat(path), path) for path in entries)

# leave only regular files, insert creation date
entries = ((stat[ST_CTIME], path)
           for stat, path in entries if S_ISREG(stat[ST_MODE]))
#NOTE: on Windows `ST_CTIME` is a creation date 
#  but on Unix it could be something else
#NOTE: use `ST_MTIME` to sort by a modification date
        
for cdate, path in sorted(entries):
    print(time.ctime(cdate), os.path.basename(path))

Example:

$ python stat_creation_date.py
Thu Feb 11 13:31:07 2009 stat_creation_date.py
jfs
  • 1
    This worked perfectly. I'm trying to compare two directories cdate with each other. Is there a way to compare the seconds between the two cdates? – Federer Jan 26 '12 at 15:25
  • @malcmcmul: `cdate` is a float number of seconds since Epoch. – jfs Jan 26 '12 at 18:20
  • 4
    This works but the most succinct solution is at http://stackoverflow.com/a/4500607/68534 – jmoz Jul 23 '15 at 11:12
  • @jmoz: do you mean like [this](http://stackoverflow.com/questions/168409/how-do-you-get-a-directory-listing-sorted-by-creation-date-in-python/168424#comment1735260_168424). The solution you've link is wrong: it doesn't filter regular files. Note: my solution calls `stat` once per dir.entry. – jfs Jul 23 '15 at 14:43
  • Forgive me, link provided by Sabastian is even more succinct! Thank you. – jmoz Jul 24 '15 at 14:22
  • paths = sorted(Path(directory).iterdir(), key=os.path.getmtime) File "/usr/lib/python2.7/genericpath.py", line 62, in getmtime return os.stat(filename).st_mtime TypeError: coercing to Unicode: need string or buffer, PosixPath found – Lava Sangeetham May 14 '21 at 18:14
  • @LavaSangeetham notice that the answer for the pathlib solution says Python 3, not Python 2.7 – jfs May 15 '21 at 18:38
43

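There is an os.path.getmtime function that returns the modification time as the number of seconds since the epoch. It is a thin wrapper around os.stat, so it is more convenient than calling os.stat yourself rather than faster.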

import os 

os.chdir(directory)
sorted(filter(os.path.isfile, os.listdir('.')), key=os.path.getmtime)
daaawx
gypaetus
26

Here's my version:

import os

def getfiles(dirpath):
    a = [s for s in os.listdir(dirpath)
         if os.path.isfile(os.path.join(dirpath, s))]
    a.sort(key=lambda s: os.path.getmtime(os.path.join(dirpath, s)))
    return a

First, we build a list of the file names. isfile() is used to skip directories; it can be omitted if directories should be included. Then, we sort the list in place, using the modification date as the key.

efotinis
22

Here's a one-liner:

import os
import time
from pprint import pprint

pprint([(x[0], time.ctime(x[1].st_ctime)) for x in sorted([(fn, os.stat(fn)) for fn in os.listdir(".")], key = lambda x: x[1].st_ctime)])

This calls os.listdir() to get a list of the filenames, then calls os.stat() on each one to get st_ctime (the creation time on Windows), then sorts by that time.

Note that this method only calls os.stat() once for each file, which will be more efficient than calling it for each comparison in a sort.
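
The same idea reads more easily as a small function. This is only a sketch: list_by_ctime and format_listing are names invented here, and os.stat() will still raise on a broken symlink.

```python
import os
import time


def list_by_ctime(dirpath="."):
    """Return (name, stat_result) pairs sorted by st_ctime.

    os.stat() is called exactly once per directory entry; the result is
    reused for both sorting and display.
    """
    entries = []
    for name in os.listdir(dirpath):
        st = os.stat(os.path.join(dirpath, name))
        entries.append((name, st))
    entries.sort(key=lambda pair: pair[1].st_ctime)
    return entries


def format_listing(entries):
    """Turn the pairs into 'date  name' strings without calling stat again."""
    return ["{}  {}".format(time.ctime(st.st_ctime), name) for name, st in entries]
```

Usage would be something like print("\n".join(format_listing(list_by_ctime(".")))).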

Greg Hewgill
19

In Python 3.5+:

from pathlib import Path
sorted(Path('.').iterdir(), key=lambda f: f.stat().st_mtime)
ignorant
17

Without changing directory:

import os    

path = '/path/to/files/'
name_list = os.listdir(path)
full_list = [os.path.join(path,i) for i in name_list]
time_sorted_list = sorted(full_list, key=os.path.getmtime)

print(time_sorted_list)

# if you want just the filenames sorted, simply remove the dir from each
sorted_filename_list = [os.path.basename(i) for i in time_sorted_list]
print(sorted_filename_list)
Nic
15
from pathlib import Path
import os

sorted(Path('./').iterdir(), key=lambda t: t.stat().st_mtime)

or

sorted(Path('./').iterdir(), key=os.path.getmtime)

or

sorted(os.scandir('./'), key=lambda t: t.stat().st_mtime)

where mtime is the modification time.

Pygirl
11

Here's my answer using glob without filter if you want to read files with a certain extension in date order (Python 3).

import glob
import os

dataset_path = '/mydir/'
files = glob.glob(dataset_path + "/morepath/*.extension")
files.sort(key=os.path.getmtime)
dinos66
9
# *** the shortest and best way ***
# getmtime --> sort by modified time
# getctime --> sort by created time

import glob,os

lst_files = glob.glob("*.txt")
lst_files.sort(key=os.path.getmtime)
print("\n".join(lst_files))
Arash
5
import os

sorted(filter(os.path.isfile, os.listdir('.')),
    key=lambda p: os.stat(p).st_mtime)

You could use next(os.walk('.'))[-1] instead of filtering with os.path.isfile (os.walk('.').next()[-1] in Python 2), but that leaves dead symlinks in the list, and os.stat will fail on them.

Alex Coventry
4

For completeness, here is os.scandir (2x faster than pathlib):

import os
sorted(os.scandir('/tmp/test'), key=lambda d: d.stat().st_mtime)
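
If only the most recent file is needed, a full sort is unnecessary; max() over the os.scandir() entries is enough. A rough sketch, assuming Python 3.6+ (for using scandir as a context manager) and using a made-up helper name, newest_file:

```python
import os


def newest_file(dirpath):
    """Return the path of the most recently modified regular file, or None."""
    with os.scandir(dirpath) as it:
        # is_file() follows symlinks and is False for directories
        # and for symlinks whose target is missing
        files = [entry for entry in it if entry.is_file()]
    if not files:
        return None
    return max(files, key=lambda entry: entry.stat().st_mtime).path
```

This is a single pass over the directory, so it avoids the O(n log n) sort when the rest of the ordering is not needed.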
n1nj4
1

Alex Coventry's answer will raise an exception if the file is a symlink to a nonexistent file. The following code corrects that answer:

import os
import time

sorted(filter(os.path.isfile, os.listdir('.')),
    key=lambda p: os.stat(p).st_mtime if os.path.exists(p) else time.time())

When the file doesn't exist, the current time is used, so the symlink goes at the very end of the list.
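
Another way to deal with dead symlinks is to skip whatever os.stat() cannot read instead of inventing a timestamp for it. A minimal sketch under that assumption; mtime_or_none and sorted_existing are hypothetical names, and directories are deliberately kept in the result:

```python
import os


def mtime_or_none(path):
    """Return the modification time of `path`, or None if os.stat() fails."""
    try:
        return os.stat(path).st_mtime
    except OSError:  # e.g. a symlink whose target no longer exists
        return None


def sorted_existing(dirpath="."):
    """Paths in `dirpath` sorted by mtime, skipping entries that cannot be stat'ed."""
    timed = []
    for name in os.listdir(dirpath):
        path = os.path.join(dirpath, name)
        mtime = mtime_or_none(path)
        if mtime is not None:
            timed.append((mtime, path))
    return [path for mtime, path in sorted(timed)]
```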

Paolo Benvenuto
1

This was my version:

import os

folder_path = r'D:\Movies\extra\new\dramas' # your path
os.chdir(folder_path) # make the path active
x = sorted(os.listdir(), key=os.path.getctime)  # sorted using creation time

for name in x:
    print(name) # print every file and folder name inside folder_path
sɐunıɔןɐqɐp
haqrafiul
  • In my code the files are sorted as oldest to newest. To get newest filenames or folders first, you need to add reverse = True in the file list (in my case it was x). so, x = sorted(os.listdir(), key=os.path.getctime, reverse=True) – haqrafiul Jun 03 '20 at 12:18
1

This is a basic version for learning:

import os, sys
import time

dirpath = sys.argv[1] if len(sys.argv) == 2 else r'.'

for name in os.listdir(dirpath):
    # join with dirpath instead of calling os.chdir() on every iteration,
    # which breaks for relative paths after the first pass
    data_001 = os.path.realpath(os.path.join(dirpath, name))
    listdir_stat1 = os.stat(data_001)
    print(time.ctime(listdir_stat1.st_ctime), data_001)
Christian Specht
cumulus13
0

Here are a few simple lines that filter by extension and provide a sort option:

import os
import re

def get_sorted_files(src_dir, regex_ext='.*', sort_reverse=False):
    # regex_ext is a regular expression for the extension, e.g. 'txt|csv';
    # the default '.*' matches any extension ('*' alone is not a valid regex)
    files_to_evaluate = [os.path.join(src_dir, f) for f in os.listdir(src_dir) if re.search(r'.*\.({})$'.format(regex_ext), f)]
    files_to_evaluate.sort(key=os.path.getmtime, reverse=sort_reverse)
    return files_to_evaluate
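
If a regular expression is more than you need, str.endswith() accepts a tuple of suffixes and anchors the match at the end of the name, so "a.txt.bak" is not picked up by ".txt". A small sketch of that variant; sorted_files_by_ext and its parameters are illustrative names only:

```python
import os


def sorted_files_by_ext(src_dir, extensions=(".txt",), newest_first=False):
    """Regular files in `src_dir` ending with one of `extensions`, sorted by mtime."""
    exts = tuple(ext.lower() for ext in extensions)
    paths = [
        os.path.join(src_dir, name)
        for name in os.listdir(src_dir)
        if name.lower().endswith(exts)  # case-insensitive suffix match
    ]
    paths = [p for p in paths if os.path.isfile(p)]  # drop directories
    return sorted(paths, key=os.path.getmtime, reverse=newest_first)
```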
TXN_747
0

Put the directory in path and, if you want a specific file type, add the file extension to the glob pattern; the file names then come back in chronological order. This works for me.

import glob, os
from pathlib import Path
path = os.path.expanduser(file_location+"/"+date_file)  
os.chdir(path)    
saved_file=glob.glob('*.xlsx')
saved_file.sort(key=os.path.getmtime)

print(saved_file)
Aps
-1

Turns out os.listdir sorts by last modified but in reverse so you can do:

import os
last_modified=os.listdir()[::-1]
Mayank
  • "Turns out os.listdir sorts by last modified but in reverse " - No, it doesn't. The doc clearly states: "os.listdir(path='.') Return a list containing the names of the entries in the directory given by path. The list is **in arbitrary order**" (emphasis mine) – Thierry Lathuille Sep 05 '21 at 20:03
-5

Maybe you should use shell commands. In Unix/Linux, find piped with sort will probably be able to do what you want.

stephanea