I'm trying to get the time of last modification (os.stat.st_mtime) of a particular directory. My issue is I have added a few metadata files that are hidden (they start with .). If I use os.stat(directory).st_mtime I get the date at which I updated the metadata file, not the date that a non-hidden file was modified in the directory. I would like to get the most recent time of modification for all of the other files in the directory other than the hidden metadata files.

I figure it's possible to write my own function, something along the lines of:

for file in folder:
    if not file starts with '.':
        modified_times.append(os.path.getmtime('/path/to/file'))
last_time = most recent of modified_times

However, is it possible to do this natively in Python? Or do I need to write my own function like the pseudocode above (or something like this question)?

wcarhart
  • A hidden file is simply a file whose name begins with a period and that is not by default shown by `ls`. In other words, it is a feature of `ls` (or any other file manager), not of the file. If you want to ignore hidden files, you have to "teach" Python by writing the code in your question or something similar. – DYZ Jun 20 '18 at 00:06
  • `modified_times=[os.path.getmtime(f) for f in folder if f[0]!='.']` – DYZ Jun 20 '18 at 00:08
  • What you've written is fine. One minor improvement you could make is to not build the list in memory (by passing a generator expression to the `max` function, or by just doing `if val >= max: max = val` in the loop instead of `append`, or whatever). But considering that directories rarely have more than a few thousand files, and the cost of 1000 stat calls vastly outweighs the cost of allocating a small list, I wouldn't worry about it. – abarnert Jun 20 '18 at 00:31
  • But meanwhile, if this is something you need to do often (like, say, every 20 minutes from now until the end of time), you may want to set up a kqueue/inotify/FSEvents/FindFirstChangeNotification watcher to update the time whenever a file is touched unless it's hidden, so you only need to do the scan once at startup and then never again (unless your program is restarted). – abarnert Jun 20 '18 at 00:35

1 Answer

Your desired outcome is impossible. The most recent modification time of all non-hidden files doesn't necessarily correspond to the virtual "last modified time of a directory ignoring hidden files". The problem is that directories are modified when files are moved in and out of them, but the file timestamps aren't changed (the file was moved, but not modified). So your proposed solution is at best a heuristic; you can hope it's correct, but there is no way to be sure.
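To make the caveat concrete, here is a small sketch that backdates a file, moves it into a directory, and compares the two timestamps. The file names and the backdated timestamp are arbitrary choices for illustration, and the behavior shown assumes a typical local file system where a rename within one volume preserves the file's mtime.

```python
import os
import shutil
import tempfile

# Scratch area; every name below is made up for this demonstration.
workdir = tempfile.mkdtemp()
try:
    target_dir = os.path.join(workdir, "target")
    os.mkdir(target_dir)

    # Create a file outside the directory and backdate its timestamps.
    old_file = os.path.join(workdir, "old.txt")
    with open(old_file, "w") as fh:
        fh.write("data")
    old_timestamp = 1000000000  # September 2001, chosen only for illustration
    os.utime(old_file, (old_timestamp, old_timestamp))

    # Move the file in: the directory is modified, the file is not.
    moved = os.path.join(target_dir, "old.txt")
    os.rename(old_file, moved)

    file_mtime = os.path.getmtime(moved)
    dir_mtime = os.path.getmtime(target_dir)
    print("file mtime:", file_mtime)
    print("dir mtime: ", dir_mtime)
finally:
    shutil.rmtree(workdir)
```

A scan of file timestamps would report the 2001 value here, even though the directory itself changed only a moment ago.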

In any event, no, there is no built-in that provides this heuristic. The concept of hidden vs. non-hidden files is OS and file system dependent, and Python provides no built-in API that cares about the distinction. If you want to make a "last_modified_guess" function, you'll have to write it yourself (I recommend basing it on os.scandir for efficiency).

Something as simple as:

last_time = max(entry.stat().st_mtime for entry in os.scandir(somedir) if not entry.name.startswith('.'))

would get you the most recent last modified time (in seconds since the epoch) of your non-hidden directory entries.
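If you want that one-liner as a reusable function, a minimal sketch might look like the following. The name `last_modified_guess` comes from the suggestion above; returning `None` for a directory with no non-hidden entries is my own assumption about what is convenient, and the `with` form of `os.scandir` needs Python 3.6 or later.

```python
import os


def last_modified_guess(directory):
    """Return the newest mtime among non-hidden entries, or None if there are none."""
    # Closing the scandir iterator promptly releases the directory handle.
    with os.scandir(directory) as entries:
        return max(
            (
                entry.stat().st_mtime
                for entry in entries
                if not entry.name.startswith('.')
            ),
            default=None,  # avoids ValueError when nothing qualifies
        )
```

Note that `entry.stat()` follows symbolic links by default, so a broken symlink in the directory would raise an error; pass `follow_symlinks=False` if you would rather use the link's own timestamp.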

Update: On further reflection, the glob module does include a concept of a . prefix meaning "hidden", so you could use glob.glob/glob.iglob with os.path.join(somedir, '*') to have it filter out the "hidden" files for you. That said, by doing so, you give up some of the potential benefits of os.scandir (free or cached stat results, free type checks, etc.), so if all you need is "hidden" filtering, glob is not worth giving those up; a simple .startswith('.') check does the job.
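For comparison, a glob-based variant could be sketched like this. The function name is hypothetical, `glob.escape` is used on the assumption that the directory name itself might contain wildcard characters, and `None` for an empty result is again my own choice.

```python
import glob
import os


def last_modified_guess_glob(directory):
    """Newest mtime among entries matched by '*', which skips dot-prefixed names."""
    # Escape the directory part so characters like '[' in it are taken literally.
    pattern = os.path.join(glob.escape(directory), '*')
    return max(
        (os.path.getmtime(path) for path in glob.iglob(pattern)),
        default=None,
    )
```

Each `os.path.getmtime` call here is a separate stat on the full path, which is the cost referred to above.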

ShadowRanger
  • Works great for Python3, but forgot to mention that I am using Python2.7, which doesn't have `os.scandir()` – wcarhart Jun 20 '18 at 16:36
  • @wcarhart: [There is a third party PyPI package, `scandir`](https://pypi.org/project/scandir/), that provides the same API under the name `scandir.scandir()`, assuming you consider that worth it. Otherwise, you'll be stuck with `os.listdir` or `glob.glob` (backed by `os.listdir`), which should be fine, if slightly less friendly, unless you're talking about directories with thousands of files or more. – ShadowRanger Jun 20 '18 at 20:35
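
For reference, an `os.listdir` fallback along the lines of the last comment might be sketched as below. It avoids `os.scandir` and the `default=` argument of `max`, so it should run on Python 2.7 as well as Python 3; the function name and the `None` return for an empty result are illustrative assumptions.

```python
import os


def last_modified_guess_listdir(directory):
    """Newest mtime among non-hidden entries, using only os.listdir."""
    mtimes = [
        os.path.getmtime(os.path.join(directory, name))
        for name in os.listdir(directory)
        if not name.startswith('.')
    ]
    # max() on an empty list raises ValueError, so handle that case explicitly.
    return max(mtimes) if mtimes else None
```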