3

I'm trying to write a Python script that runs on Windows. Files are copied to a folder every few seconds, and I'm polling that folder every 30 seconds for names of the new files that were copied to the folder after the last poll.

What I have tried is to use one of the os.path.getXtime(folder_path) functions and compare that with the timestamp of my previous poll. If the getXtime value is larger than the timestamp, then I work on those files.

I have tried to use the function os.path.getctime(folder_path), but that didn't work because the files were created before I wrote the script. I tried os.path.getmtime(folder_path) too but the modified times are usually smaller than the poll timestamp.

Finally, I tried os.path.getatime(folder_path), which works for the first time the files were copied over. The problem is I also read the files once they were in the folder, so the access time keeps getting updated and I end up reading the same files over and over again.

I'm not sure what is a better way or function to do this.

ShadowRanger
  • 143,180
  • 12
  • 188
  • 271
Rayne
  • 14,247
  • 16
  • 42
  • 59

1 Answers1

1

You've got a bit of an XY problem here. You want to know when files in a folder change, you tried a homerolled solution, it didn't work, and now you want to fix your homerolled solution.

Can I suggest that instead of terrible hackery, you use an existing package designed for monitoring for file changes? One that is not a polling loop, but actually gets notified of changes as they happen? While inotify is Linux-only, there are other options for Windows.

Community
  • 1
  • 1
ShadowRanger
  • 143,180
  • 12
  • 188
  • 271
  • 1
    I'd also add that polling for files that appear in a directory is a pretty bad way to reliably transfer information. It seems easy, which I presume is why it's so popular. But it is horribly bug-prone and literally impossible to implement reliably without some out-of-band communications to ensure files that "appear" in the directory are complete. Without that out-of-band communication, it's **impossible** to know that a new file in the directory is complete. Nevermind the problem of figuring out what files are the new ones. "Works most of the time" is not anything I **ever** want to deliver. – Andrew Henle Feb 18 '17 at 12:15