3

I am having issues where I am seeing additional events that I am not expecting.

I am watching the folder C:\Users\kvasko\Downloads\data. If I copy a folder 2017\07\25\LogFile.xml I will see the following 3 "created" events, when I would expect to only see 1. If I create the date folder structure ahead of time (but while the application running watching the folders) it will only generate one event like I expect. I never get an event for just a folder creation. It is like the events are being generated for the creation of the folders, but when inspecting the actual event messaged on my on_created(self,event) all three look events look exactly the same. What is going on here?

Here is the sample output and minimum example.

2017-09-22 13:58:10,182 - root - INFO - Watchdog: file created C:\Users\kvasko\Downloads\data\2017\07\25\LogFile.xml
2017-09-22 13:58:11,184 - root - INFO - Watchdog: file created C:\Users\kvasko\Downloads\data\2017\07\25\LogFile.xml
2017-09-22 13:58:12,187 - root - INFO - Watchdog: file created C:\Users\kvasko\Downloads\data\2017\07\25\LogFile.xml

I would expect:

2017-09-22 13:58:12,187 - root - INFO - Watchdog: file created C:\Users\kvasko\Downloads\data\2017\07\25\LogFile.xml

Is there a way to detect if its actually multiple events from folder creation?

The following is my observer configuration.

folder = "C:\\Users\\kvasko\\Downloads\\data"
observer = Observer(MyProcessHandler(patterns=["*.xml"]), folder, recursive=True)
observer.start_observer()

os.mkdirs("C:\\Users\\kvasko\\Downloads\\data\\2017\\07\\25")
shutil.copy2("C:\temp\LogFile.xml", "C:\\Users\\kvasko\\Downloads\\data\\2017\\07\\25")

try:
    while True:
        time.sleep(5)
except:
    print("Error")

The following is my handler class.

import logging
from watchdog.events import PatternMatchingEventHandler

class MyProcessHandler(PatternMatchingEventHandler):

def on_created(self, event):
    logging.info("Watchdog: file created " + str(event.src_path))

Edit:

Here is a minimum working example:

import time
import os
import shutil
import datetime
from watchdog.observers import Observer
from watchdog.events import PatternMatchingEventHandler

class   TestEventHandler(PatternMatchingEventHandler):
    def on_created(self, event):
        print (str(datetime.datetime.now()) + " " + str(event))

if __name__ == '__main__':
    path = "C:\\Temp"
    event_handler = TestEventHandler(patterns=["*.xml"])
    observer = Observer()
    observer.schedule(event_handler, path, recursive=True)
    observer.start()

    os.makedirs("C:\\Temp\\2017\\07\\25")
    shutil.copy2("C:\\Temp2\\2017\\07\\25\\test.xml", "C:\\Temp\\2017\\07\\25")

    try:
       while True:
           time.sleep(1)
    except KeyboardInterrupt:
       observer.stop()
    observer.join()

Prints out:

2017-09-22 15:49:51.334262 <FileCreatedEvent: src_path='C:\\Temp\\2017\\07\\25\\test.xml'>
2017-09-22 15:49:52.335468 <FileCreatedEvent: src_path='C:\\Temp\\2017\\07\\25\\test.xml'>
2017-09-22 15:49:53.340998 <FileCreatedEvent: src_path='C:\\Temp\\2017\\07\\25\\test.xml'>

Edit2:

Change on_created() to on_any_event(). This is what was produced.

2017-09-23 13:14:57.288792 <FileCreatedEvent: src_path='C:\\Temp\\2017\\07\\25\\test.xml'>
2017-09-23 13:14:58.291327 <FileCreatedEvent: src_path='C:\\Temp\\2017\\07\\25\\test.xml'>
2017-09-23 13:14:59.293334 <FileCreatedEvent: src_path='C:\\Temp\\2017\\07\\25\\test.xml'>
2017-09-23 13:14:59.293334 <FileModifiedEvent: src_path='C:\\Temp\\2017\\07\\25\\test.xml'>
Kevin Vasko
  • 1,561
  • 3
  • 22
  • 45
  • Does your copy of "LogFile.xml" tales more the two seconds? – Laurent LAPORTE Sep 22 '17 at 19:54
  • @LaurentLAPORTE no, these files are like 100KB a piece. – Kevin Vasko Sep 22 '17 at 19:55
  • @KevinVasko: It would be very helpful to us if you modify your code to make it a runnable example. – unutbu Sep 22 '17 at 20:20
  • @KevinVasko It's weird because there are 3 events separated by 1 second from each other. – Laurent LAPORTE Sep 22 '17 at 20:42
  • @unutbu I edited the original question and added a MWE. – Kevin Vasko Sep 22 '17 at 20:52
  • @LaurentLAPORTE Yeah, I thought that was weird too. See my edit for a MWE. I didn't add any type of delay or anything but they are almost exactly 1 second apart. It doesn't do that but only when folders are created initially it seems. If you copy a file into the `C:\\Temp\\2017\\07\\25\\` folder only 1 file is displayed. It is almost like it is getting multiple events for the creation of folders if there aren't any files in them initially. I tried working around that by putting an empty file initially and it still did the triple message. – Kevin Vasko Sep 22 '17 at 20:55
  • @KevinVasko: Maybe changing `on_created` to `on_any_event` will show a more complete picture of what is going on. – unutbu Sep 22 '17 at 21:08
  • @unutbu I just added that information. Looks like the only thing that was changed was that it triggered a modified event. – Kevin Vasko Sep 23 '17 at 18:17
  • @KevinVasko: You might be experiencing this bug: https://github.com/gorakhargosh/watchdog/issues/346, which references [this question](https://stackoverflow.com/q/32861420/190597). This bug seems to affect Windows and OSX; I'm not able to reproduce it on Ubuntu 14.04 with watchdog version 0.8.3, and python 3.4.3. – unutbu Sep 23 '17 at 18:36

1 Answers1

6

You might be experiencing this bug. As a workaround, you could use the TestEventHandler class to record the last file path created and not respond to subsequent on_created events unless the path is different than the last created path or if that path has been deleted:

import time
import os
import shutil
import datetime
from watchdog.observers import Observer
from watchdog.events import PatternMatchingEventHandler

class TestEventHandler(PatternMatchingEventHandler):
    def __init__(self, *args, **kwargs):
        super(TestEventHandler, self).__init__(*args, **kwargs)
        self.last_created = None
    def on_created(self, event):
        path = event.src_path        
        if path != self.last_created:
            print(str(datetime.datetime.now()) + " " + str(event))
            self.last_created = path
    def on_deleted(self, event):
        path = event.src_path
        if path == self.last_created:
            self.last_created = None

if __name__ == '__main__':

    path = "C:\\Temp"
    target_dir = "C:\\Temp\\2017\\07\\25"
    src_dir = "C:\\Temp2\\2017\\07\\25"
    filename = 'test.xml'

    target = os.path.join(target_dir, filename)
    src = os.path.join(src_dir, filename)

    event_handler = TestEventHandler(patterns=["*.xml"])
    observer = Observer()
    observer.schedule(event_handler, path, recursive=True)
    observer.start()

    if not os.path.exists(target_dir):
        os.makedirs(target_dir)

    if os.path.exists(target):
        os.unlink(target)

    for i in range(3):
        shutil.copy2(src, target_dir)    

    try:
       while True:
           time.sleep(1)
    except KeyboardInterrupt:
       observer.stop()
    observer.join()
unutbu
  • 842,883
  • 184
  • 1,785
  • 1,677
  • That looks to be the problem and your workaround works perfectly! Thanks a ton. I marked your solution as the answer. – Kevin Vasko Sep 25 '17 at 15:39