
I am about to implement the archetypal FileSystemWatcher solution. I have a directory to monitor for file creations, and the task of picking up the created files and inserting them into a DB. Roughly, this will involve reading and processing 6 or 7 80-character text files that appear about 150 ms apart, in bursts that occur every couple of seconds; rarely, a 2 MB binary file will also have to be processed. This will most likely be a 24/7 process.

From what I have read about the FileSystemWatcher object it is better to enqueue its events in one thread and then dequeue/process them in another thread. The quandary I have right now is what would be the better creation mechanism of the thread that does the processing. The choices I can see are:

  1. Each time I get a FSW event I manually create a new thread (yeah I know .. stupid architecture, but I had to say it).

  2. Throw the processing at the CLR thread pool whenever I get an FSW event

  3. On start up, create a dedicated second thread for the processing and use a producer/consumer model to handle the work. The main thread enqueues the request and the second thread dequeues it and performs the work.

I am tending towards the third method as the preferred one as I know the work thread will always be required - and also probably more so because I have no feel for the thread pool.

Peter M

3 Answers


The third option is the most logical.

With regard to FSW missing some file events, I implemented this: 1) an FSW object which fires on FileCreate; 2) a timer, tmrFileCheck, with an interval of 5000 ms (5 seconds), which calls tmrFileCheck_Tick.

When the FileCreate event occurs, if (tmrFileCheck.Enabled == false) then tmrFileCheck.Start()

This way, once the timer interval elapses, tmrFileCheck_Tick fires, which a) calls tmrFileCheck.Stop() and b) calls CheckForStragglerFiles.

In the tests I've run, this works effectively when fewer than 100 files are created per minute.

A variant is to merely have a timer tick every NN seconds and sweep the directory(ies) for straggler files.
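The timer arrangement above can be sketched in a language-neutral way. This is only an illustration in Python, not the .NET code I used: `StragglerCheck`, `on_file_created`, and the `sweep` callback are hypothetical names standing in for the FSW handler, tmrFileCheck, and CheckForStragglerFiles.

```python
import threading

class StragglerCheck:
    """Start a one-shot timer on the first file event; sweep when it fires.

    Assumption: `sweep` is a caller-supplied function that scans the
    directory for files the watcher did not report.
    """

    def __init__(self, interval_seconds, sweep):
        self._interval = interval_seconds
        self._sweep = sweep
        self._lock = threading.Lock()
        self._timer = None  # None means "timer not running"

    def on_file_created(self):
        # Equivalent of: if the timer is not enabled, start it.
        with self._lock:
            if self._timer is None:
                self._timer = threading.Timer(self._interval, self._tick)
                self._timer.daemon = True
                self._timer.start()

    def _tick(self):
        # Equivalent of: stop the timer, then check for stragglers.
        with self._lock:
            self._timer = None
        self._sweep()
```

Any number of file events arriving while the timer is running lead to a single sweep, and the next event after that sweep arms the timer again.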

Another variant is to hire me to press F5 to refresh the window and call you when there are straggler files; just a suggestion. :-P

TechStuffBC

If you know that the second thread will always be required, and you also know that you'll never need more than one worker thread, then option three is good enough.
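For illustration, option three might look roughly like the following. This is a minimal sketch in Python rather than .NET, and `process_file`, `worker`, and `run_demo` are hypothetical names; `process_file` stands in for reading a file and inserting it into the DB.

```python
import queue
import threading

def process_file(path, results):
    # Hypothetical placeholder for "read the file and insert it into the DB".
    results.append(path)

def worker(work_queue, results):
    """Consumer: dequeue paths one at a time until a None sentinel arrives."""
    while True:
        path = work_queue.get()
        try:
            if path is None:  # sentinel: shut down cleanly
                break
            process_file(path, results)
        finally:
            work_queue.task_done()

def run_demo(paths):
    work_queue = queue.Queue()
    results = []
    consumer = threading.Thread(target=worker, args=(work_queue, results))
    consumer.start()
    # Producer side: the watcher's event handler would do nothing but enqueue.
    for path in paths:
        work_queue.put(path)
    work_queue.put(None)
    consumer.join()
    return results
```

With a single consumer the files are processed strictly in the order they were enqueued, and a slow item (such as the occasional large binary file) simply lets the queue grow until the worker catches up.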

Anon.
  • +1, I would add that using the thread pool will try and handle your requests simultaneously on multiple threads which doesn't sound like a good thing for your application. – John Knoeller Feb 22 '10 at 01:06
  • Anon .. From what testing I have done my processing should be done well and truly in the 150mS except in the case of the binary file processing - that will run at about 150mS but should be such a rare occurrence that there will be plenty of time to catch up if things get queued. – Peter M Feb 22 '10 at 03:20

Just be aware that FileSystemWatcher may miss events; there is no guarantee that it will deliver every event that has occurred. Your design of keeping the work done by the thread receiving events to a minimum should reduce the chances of that happening, but it is still a possibility, given the finite event buffer size (it tops out at 64 KB).

I would highly recommend developing a battery of torture tests if you decide to use FileSystemWatcher.

In our testing, we encountered issues with network locations that changing InternalBufferSize did not fix, and when this happened we did not receive Error event notifications either.

Thus, we developed our own polling mechanism for doing so, using Directory.GetFiles, followed by comparing the state of the returned files with the previously polled state, ensuring we always had an accurate delta.

Of course, this comes at a substantial cost in performance, which may not be good enough for you.
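The idea of comparing each poll with the previous one can be sketched as follows. This is not our production code, just a minimal illustration in Python instead of .NET; `snapshot` and `diff` are hypothetical helper names, and using size plus modification time as the "state" of a file is an assumption.

```python
import os

def snapshot(directory):
    """Map each regular file's name to (size, modification time in ns)."""
    state = {}
    with os.scandir(directory) as entries:
        for entry in entries:
            if entry.is_file():
                info = entry.stat()
                state[entry.name] = (info.st_size, info.st_mtime_ns)
    return state

def diff(previous, current):
    """Return (created, deleted, changed) file names between two snapshots."""
    created = sorted(current.keys() - previous.keys())
    deleted = sorted(previous.keys() - current.keys())
    changed = sorted(
        name
        for name in current.keys() & previous.keys()
        if current[name] != previous[name]
    )
    return created, deleted, changed
```

Each poll takes a fresh snapshot, diffs it against the stored one, and then stores the new snapshot, so nothing depends on events having been delivered.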

Leon Breedt
  • Leon, I'm well aware of the FSW limitations and issues. It does not seem robust on network shares. I'm only going to be using it on a local directory, and I don't expect the FSW event buffer size to cause me problems. I'm planning on a sweeper process in case I miss some things. – Peter M Feb 22 '10 at 03:14
  • Leon .. BTW I will be planning a lot of tests .. FSW seems to have a huge number of hidden gotchas. – Peter M Feb 22 '10 at 03:22
  • If I were doing this, I would go for FSW, and run a full sweep over the directory every now and then (maybe daily, at a time the system is normally quiet?) to make sure everything gets caught. – Anon. Feb 22 '10 at 03:25