I've modified the watchdog example to monitor a specific directory in Windows for .jpg photos that are added to it.

import time
from watchdog.observers import Observer
from watchdog.events import FileSystemEventHandler

paths = []

xp_mode = 'off'

class FileHandler(FileSystemEventHandler):

    def on_created(self, event):
        # Creation events are only recorded when xp_mode is switched on.
        if xp_mode == 'on':
            if not event.is_directory and 'thumbnail' not in event.src_path:
                print("Created: " + event.src_path)
                paths.append(event.src_path)

    def on_modified(self, event):
        # Record every modified file, skipping directories and thumbnails.
        if not event.is_directory and 'thumbnail' not in event.src_path:
            print("Modified: " + event.src_path)
            paths.append(event.src_path)

if __name__ == "__main__":
    path = 'C:\\'
    event_handler = FileHandler()
    observer = Observer()
    observer.schedule(event_handler, path, recursive=True)
    observer.start()
    try:
        while True:
            time.sleep(1)
    except KeyboardInterrupt:
        observer.stop()

    observer.join()

One thing I have noticed is that when a file is added, both on_created and on_modified are called! To work around this, I decided to use only the on_modified method. However, I am starting to notice that this also causes multiple callbacks, this time to the on_modified method itself!

Modified: C:\images\C121211-0008.jpg
Modified: C:\images\C121211-0009.jpg
Modified: C:\images\C121211-0009.jpg <--- What?
Modified: C:\images\C121211-0010.jpg
Modified: C:\images\C121211-0011.jpg
Modified: C:\images\C121211-0012.jpg
Modified: C:\images\C121211-0013.jpg

I cannot figure out for the life of me why this is happening! It doesn't seem to be consistent either. If anyone could shed some light on this issue, it would be greatly appreciated.

There was a similar post, but it was for Linux: python watchdog modified and created duplicate events

user1927638

1 Answer

When a process writes a file, it first creates it, then writes the contents a piece at a time.

What you're seeing is a set of events corresponding to those actions. Sometimes the pieces are written quickly enough that Windows only sends a single event for all of them, and other times you get multiple events.

This is normal... depending on what the surrounding code needs to do, it might make sense to keep a set of modified pathnames rather than a list.
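
As a rough illustration of the set idea, here is a minimal sketch. `PathCollector` and its `record` method are hypothetical names, not part of watchdog, and the event paths are simply replayed from the output in the question. It is written with `print()` so it runs on Python 3 as well.

```python
class PathCollector(object):
    """Collects modified paths, ignoring repeats (hypothetical helper)."""

    def __init__(self):
        self.paths = set()

    def record(self, src_path):
        """Return True if src_path is new, False if it was already seen."""
        if src_path in self.paths:
            return False
        self.paths.add(src_path)
        return True


collector = PathCollector()

# Stand-in for the events watchdog would deliver to on_modified.
events = [
    r"C:\images\C121211-0008.jpg",
    r"C:\images\C121211-0009.jpg",
    r"C:\images\C121211-0009.jpg",  # duplicate event for the same file
    r"C:\images\C121211-0010.jpg",
]

for src_path in events:
    # Only act the first time a given path is reported.
    if collector.record(src_path):
        print("Modified: " + src_path)
```

Inside the handler, `on_modified` would call `collector.record(event.src_path)` instead of `paths.append(...)`. One caveat: a path stays in the set until you remove it, so a later, genuine change to the same file is ignored unless you discard the path once you have processed it.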

RichieHindle
  • So is it okay to leave the event duplication as it is? I would like to do something for every trigger (like a lambda), so this type of thing may not be ideal. Are there any alternatives other than storing it as a set? – elams Mar 08 '22 at 13:43
  • For other people: You want to debounce the `on_xyz` function. You can search for "python debounce function/decorator". This will buffer multiple quick calls until they slow down and then call the function only once. – RaiderB Jul 18 '22 at 19:28
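
The debounce suggestion in the last comment could look roughly like the sketch below. It uses only the standard library; `Debouncer`, `trigger`, and the 0.2 second wait are assumptions made for illustration, not watchdog features. Each path gets its own quiet period, and the callback runs once per path after events for it stop arriving.

```python
import threading
import time


class Debouncer(object):
    """Calls callback(key) once per key after `wait` quiet seconds (hypothetical helper)."""

    def __init__(self, wait, callback):
        self.wait = wait
        self.callback = callback
        self._latest = {}  # key -> token of the most recent trigger
        self._lock = threading.Lock()

    def trigger(self, key):
        # Each call supersedes earlier calls for the same key.
        token = object()
        with self._lock:
            self._latest[key] = token
        timer = threading.Timer(self.wait, self._fire, args=(key, token))
        timer.daemon = True
        timer.start()

    def _fire(self, key, token):
        with self._lock:
            if self._latest.get(key) is not token:
                return  # a newer event arrived; its own timer will fire
            del self._latest[key]
        self.callback(key)


handled = []
debouncer = Debouncer(0.2, handled.append)

# Three rapid events for one file, then a single event for another.
for _ in range(3):
    debouncer.trigger(r"C:\images\C121211-0009.jpg")
    time.sleep(0.05)
debouncer.trigger(r"C:\images\C121211-0010.jpg")

time.sleep(0.6)  # give the timers time to fire
print(handled)
```

In the handler, `on_modified` would call `debouncer.trigger(event.src_path)`. The callback runs on a timer thread, so anything it touches needs to be thread-safe. This sketch starts one short-lived timer per event, which is fine for occasional file drops but may be wasteful under heavy event load.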