1

So using the following piece of code I can easily find the most recently updated file in a folder:

files = os.listdir(UPLOAD_DIR+"/"+configData[obj]["client_name"])
paths = [os.path.join(UPLOAD_DIR+"/"+configData[obj]["client_name"], basename) for basename in files]
file = max(paths, key=os.path.getctime)

But what if there are two or more files that have the exact same updated time? How do I get a list of such files?

Mithil Bhoras
  • 2
    Totally unrelated, but hardcoding path separators (`UPLOAD_DIR+"/"+configData[obj]["client_name"]`) kind of defeats the whole point of `os.path.join()`. This should be `os.path.join(UPLOAD_DIR, configData[obj]["client_name"], basename)`. – bruno desthuilliers Oct 18 '18 at 12:12
  • 1
    Possible duplicate of [How to get all the maximums max function](https://stackoverflow.com/questions/10823227/how-to-get-all-the-maximums-max-function) – Georgy Oct 18 '18 at 12:42
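
The `os.path.join()` suggestion in the comment above could look roughly like the sketch below. The helper name `list_client_paths` and its parameters are hypothetical, standing in for `UPLOAD_DIR` and `configData[obj]["client_name"]` from the question.

```python
import os

def list_client_paths(upload_dir, client_name):
    """Return the full path of every entry in the client's upload folder."""
    # Let os.path.join pick the separator instead of hardcoding "/".
    client_dir = os.path.join(upload_dir, client_name)
    return [os.path.join(client_dir, name) for name in os.listdir(client_dir)]
```

Building the folder path once also avoids repeating the same concatenation for the `os.listdir` call and for each file name.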

3 Answers

3

The shortest code: find the latest ctime, then get all files having this latest ctime:

import os

def most_recent(paths):
    # An empty input has no most recent file.
    if not paths:
        return []
    latest_ctime = max(os.path.getctime(p) for p in paths)
    # Second pass: keep every path whose ctime equals the maximum.
    return [p for p in paths if os.path.getctime(p) == latest_ctime]

This loops over the list of paths twice, though, and there is a risk of a race condition: if the ctime of the most recent file changes between the two loops, that file is no longer matched in the second loop.

We can do it in one loop, with a little bit more code, eliminating the race condition:

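def most_recent_one_loop(paths):
    out = []
    # Start below any real timestamp so the first path always wins.
    latest_ctime = float("-inf")
    for p in paths:
        ct = os.path.getctime(p)  # read each ctime only once
        if ct > latest_ctime:
            # Strictly newer: restart the result list.
            latest_ctime = ct
            out = [p]
        elif ct == latest_ctime:
            # Tie with the current newest: keep it too.
            out.append(p)
    return out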

As expected, this is about twice as fast, since each path's ctime is read once instead of twice (tested with about 100 paths in the folder):

%timeit most_recent(paths)
# 1000 loops, best of 3: 477 µs per loop

%timeit most_recent_one_loop(paths)
# 1000 loops, best of 3: 239 µs per loop
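
Another way to avoid reading each timestamp twice is to take a snapshot of all ctimes first and then work only on that snapshot. This is a minimal sketch, not a benchmarked alternative; the function name `most_recent_snapshot` and its `key` parameter are assumptions added for illustration.

```python
import os

def most_recent_snapshot(paths, key=os.path.getctime):
    """Return every path sharing the newest timestamp, reading each one once."""
    # Pair each path with its timestamp so both steps see the same values.
    stamped = [(p, key(p)) for p in paths]
    if not stamped:
        return []
    latest = max(ts for _, ts in stamped)
    return [p for p, ts in stamped if ts == latest]
```

Because the timestamp function is a parameter, `os.path.getmtime` can be passed as `key` if the modification time is what matters.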
Thierry Lathuille
0

Probably not the tidiest way of doing it but:

maxval = os.path.getctime(max(paths, key=os.path.getctime))

indices = [index for index, val in enumerate(paths) if os.path.getctime(val) == maxval]
for index in indices:
    print(paths[index])
Andrew McDowell
0

For Python 3, it looks like the max function alone can't solve your issue, as the Python 3 docs explicitly state:

If multiple items are maximal, the function returns the first one encountered. This is consistent with other sort-stability preserving tools such as sorted(iterable, key=keyfunc, reverse=True)[0] and heapq.nlargest(1, iterable, key=keyfunc).

You may need to use sorted to find all the files that share the maximum value:

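import os

# Sort newest first; the name `list` is avoided because it shadows the built-in.
sorted_paths = sorted(paths, key=os.path.getctime, reverse=True)
files = []
for path in sorted_paths:
    # Collect paths until the ctime drops below that of the newest file.
    if os.path.getctime(sorted_paths[0]) == os.path.getctime(path):
        files.append(path)
    else:
        break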
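
The tie-breaking behaviour quoted from the docs can be checked on a small list without touching the file system. The word list below is made up for illustration, with `len` standing in for `os.path.getctime` as the key.

```python
words = ["pear", "fig", "plum", "kiwi"]

# max() reports only the first of the tied items.
first_longest = max(words, key=len)

# Comparing every item against the maximal key recovers all of them.
longest_len = len(first_longest)
all_longest = [w for w in words if len(w) == longest_len]

print(first_longest)  # pear
print(all_longest)    # ['pear', 'plum', 'kiwi']
```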
Andy Wong