7

I have to create a Windows service that monitors a specified folder for new files, processes them, and moves them to another location.

I started with using FileSystemWatcher. My boss doesn't like FileSystemWatcher and wants me to use polling by using a Timer or any other mechanism other than FileSystemWatcher.

How can you monitor a directory without using FileSystemWatcher in the .NET Framework?

p.campbell
Abbi
  • 5
    Maybe this a test of when you should fight back against your boss' opinion... Replacing with a polling mechanism sounds like crazy talk – fletcher Aug 31 '10 at 19:11
  • 7
    Why doesn't your boss want you using FileSystem Watcher? The reason might point towards the better solution. If there isn't a reason, @Giorgi probably has the right answer. – Kendrick Aug 31 '10 at 19:11
  • 3
    Allegedly FileSystemWatcher doesn't work on network drives, and I personally frequently experience cases where it does not trigger. – Peter Morris Feb 20 '13 at 12:31
  • 9
    Not all events in the Windows world can be trusted and are reliable. Maybe he's working on a critical system and they found issues with FileSystemWatcher events in their environment? Sure sounds like it to me. It's easy to bash when you don't have a clue. – md1337 May 10 '13 at 16:22

8 Answers

17

Actually, in my experience over the years, the FileSystemWatcher component is not 100% "stable". Push enough files into a folder and you will lose some events. This is especially true if you monitor a file share, even if you increase the buffer size.

So, for all practical purposes, the best solution is to use FileSystemWatcher together with a Timer that scans the folder for changes.
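One way to combine the two without processing a file twice is to route both the watcher event and the timer sweep through a single "claim" step. The following is only a sketch under assumptions: `HybridFolderMonitor`, `Sweep`, and `TryProcess` are hypothetical names, and the `process` callback stands in for whatever work the service does.

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Timers;

// Hypothetical helper: FileSystemWatcher for fast detection, Timer sweep as a safety net.
public sealed class HybridFolderMonitor : IDisposable
{
    private readonly string _path;
    private readonly Action<string> _process; // caller-supplied processing step (assumption)
    private readonly ConcurrentDictionary<string, byte> _claimed =
        new ConcurrentDictionary<string, byte>(StringComparer.OrdinalIgnoreCase);
    private readonly FileSystemWatcher _watcher;
    private readonly Timer _timer;

    public HybridFolderMonitor(string path, TimeSpan sweepInterval, Action<string> process)
    {
        _path = path;
        _process = process;

        _watcher = new FileSystemWatcher(path);
        _watcher.Created += (s, e) => TryProcess(e.FullPath);

        _timer = new Timer(sweepInterval.TotalMilliseconds) { AutoReset = true };
        _timer.Elapsed += (s, e) => Sweep();
    }

    public void Start()
    {
        _watcher.EnableRaisingEvents = true;
        _timer.Start();
        Sweep(); // pick up files that were already present before the watcher started
    }

    // Scans the folder; anything the watcher missed is claimed here.
    public void Sweep()
    {
        foreach (string file in Directory.GetFiles(_path))
            TryProcess(file);
    }

    // TryAdd succeeds for exactly one caller per path, so each path is processed once.
    public bool TryProcess(string fullPath)
    {
        if (!_claimed.TryAdd(fullPath, 0))
            return false;
        _process(fullPath);
        return true;
    }

    public void Dispose()
    {
        _timer.Dispose();
        _watcher.Dispose();
    }
}
```

If processed files are moved away and the same file name can arrive again later, the claimed entry would need to be removed once the move has finished; the sketch leaves that out.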

Examples of Timer code are easy to find online. Keep track of the DateTime at which the timer last ran, then check each file's modified date and compare it against that timestamp. The logic is fairly simple.

The timer interval depends on how urgently your system needs to see changes, but checking every minute should be fine for many scenarios.
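The timestamp comparison described above might look roughly like the sketch below. `ModifiedSincePoller` and `FilesModifiedSince` are made-up names for illustration; the timer's elapsed handler would call the method with the time of the previous run and then store the new run time.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper for the "compare modified date to last run" logic.
public static class ModifiedSincePoller
{
    // Returns the files in 'path' whose last write time (UTC) is later than 'lastRunUtc'.
    public static List<string> FilesModifiedSince(string path, DateTime lastRunUtc)
    {
        var result = new List<string>();
        foreach (string file in Directory.GetFiles(path))
        {
            if (File.GetLastWriteTimeUtc(file) > lastRunUtc)
                result.Add(file);
        }
        return result;
    }
}
```

One caveat with this approach: a file copied into the folder usually keeps the modified date of its source, so a file with an old timestamp can be missed. Tracking file names as well, or checking the creation time, may be needed depending on how files arrive.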

Loofer
Mikael Svenson
  • 4
    +1 This is the right solution: FileWatcher plus polling as a backup to handle what FileWatcher misses. -1 everyone who thinks FileWatcher is reliable. – Ian Mercer Aug 31 '10 at 21:37
  • 9
    Most of the `FileSystemWatcher` reliability issues come from a misunderstanding of how it works. The `Changed` event does not get raised when a write to the disk is queued, but instead gets raised only after the write has been committed. The write behind disk cache can impact the timeliness of the events by delaying them indefinitely. The reported solution is to flush the cache via `FlushFileBuffers`. I suspect there are other issues with network shares that have a negative impact on reliability. – Brian Gideon Sep 01 '10 at 19:01
  • 1
    @Brian: Very good explanation, and I haven't thought about the write cache. For network shares it's the buffer size of smb traffic which is too small to transfer all events, so some might get lost. – Mikael Svenson Sep 01 '10 at 19:32
  • 1
    I agree with using FileSystemWatcher + polling. Note that a lot of the "missed" events when dealing with several files are because the watcher creates a new thread for each event, and if that thread takes too long to execute the buffer will fill and new events will be discarded. The solution is to spawn a new thread inside the callback and do the work there, and not do anything else in the callback. – md1337 Aug 09 '13 at 19:54
  • Nice input @md1337. Thus increasing the buffers and spawning threads and possibly also change the max number of threads for your app might be the way to go to make event catching as "robust" as possible. – Mikael Svenson Aug 11 '13 at 17:50
  • I think Brian Gideon hit it on the nose. Changed event should be sufficient for slowly incoming files. This event keeps firing continuously, until the file write is complete. So far I have not seen a "miss" using this method. – I Stand With Russia Oct 28 '13 at 23:14
  • 1
    I know that this is a bit old, but I am a little confused on combining a polling method/timer with FileWatcher. What prevents both the poller and FileWatcher from attempting to process the same file? – Ron Jun 17 '15 at 15:29
6

Using @Petoj's answer, I've included a full Windows service that polls every five minutes for new files. It's constrained so that only one thread polls, it accounts for processing time, and it supports pause and timely stopping. It also supports easily attaching a debugger in OnStart.

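 using System;
 using System.Collections.Generic;
 using System.Diagnostics;
 using System.ServiceProcess;

 public partial class Service : ServiceBase{


    // Files already seen by a previous poll.
    List<string> fileList = new List<string>();

    System.Timers.Timer timer;

    // Start time of the current poll; used to subtract processing time from the next interval.
    DateTime LastChecked;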


    public Service()
    {
        timer = new System.Timers.Timer();
        //When autoreset is True there are reentrancy problems.
        timer.AutoReset = false;

        timer.Elapsed += new System.Timers.ElapsedEventHandler(DoStuff);
    }


    private void DoStuff(object sender, System.Timers.ElapsedEventArgs e)
    {
       LastChecked = DateTime.Now;

       string[] files = System.IO.Directory.GetFiles("c:\\", "*", System.IO.SearchOption.AllDirectories);

       foreach (string file in files)
       {
           if (!fileList.Contains(file))
           {
               fileList.Add(file);

               do_some_processing();
           }
       }


       TimeSpan ts = DateTime.Now.Subtract(LastChecked);
       TimeSpan MaxWaitTime = TimeSpan.FromMinutes(5);

       if (MaxWaitTime.Subtract(ts).CompareTo(TimeSpan.Zero) > -1)
           timer.Interval = MaxWaitTime.Subtract(ts).TotalMilliseconds;
       else
           timer.Interval = 1;

       timer.Start();
    }

    protected override void OnPause()
    {
        base.OnPause();
        this.timer.Stop();
    }

    protected override void OnContinue()
    {
        base.OnContinue();
        this.timer.Interval = 1;
        this.timer.Start();
    }

    protected override void OnStop()
    {
        base.OnStop();
        this.timer.Stop();
    }

    protected override void OnStart(string[] args)
    {
       foreach (string arg in args)
       {
           if (arg == "DEBUG_SERVICE")
                   DebugMode();

       }

        #if DEBUG
            DebugMode();
        #endif

        timer.Interval = 1;
        timer.Start();
   }

   private static void DebugMode()
   {
       Debugger.Break();
   }

 }
Community
Conrad Frix
  • 1
    Thanks for your reply. I am confused about how the Timer works. In the "do_some_processing" I have zipping and Encryption of large files and also database operations. Both are pretty time intensive. What happens when the operation say takes more than (just imagining) 5mins(the polling interval). Suppose processing took say 6mins, So if I am understanding your code correctly it will allow for 6 min for processing and then start the timer ? Before seeing ur code I was using a timer with the interval of 30 sec and so was confused what happens when the timer elapses in middle of processing – Abbi Sep 02 '10 at 20:42
  • Thanks Conrad, Finally I was able to create my service and your reply was pretty helpful. – Abbi Sep 08 '10 at 19:39
  • 3
    The code is missing the class wide `LastChecked` variable declaration. – hofnarwillie Jul 10 '13 at 00:51
  • @Conrad Frix I am trying to implement your solution but I am a little confused on how your timer is working. What is the purpose of the whole if "MaxWaitTime" conditional block? And why are you restarting the timer in the DoStuff method when you already start the timer in the OnStart method? And why set the timer interval to 1? – Datboydozy Oct 05 '22 at 19:27
  • @Datboydozy The purpose `timer.Interval = MaxWaitTime.Subtract(ts).TotalMilliseconds` is to account for the runtime in the interval. If we just set it directly without checking, we'd get an argument exception if the interval is set to less than zero. – Conrad Frix Oct 08 '22 at 00:49
4

You could use Directory.GetFiles():

using System.Collections.Generic;
using System.IO;

var fileList = new List<string>();

foreach (var file in Directory.GetFiles(@"c:\", "*", SearchOption.AllDirectories))
{
    if (!fileList.Contains(file))
    {
        fileList.Add(file);
        //do something
    }
}

Note that this only checks for new files, not changed files; if you need that, use FileInfo.
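As a rough illustration of the FileInfo route, the sketch below remembers each file's last write time between scans and reports files that are new or whose timestamp has changed. `FolderSnapshot` and `Scan` are hypothetical names, not part of the answer above.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper: keeps the last seen write time per file in memory.
public sealed class FolderSnapshot
{
    private readonly Dictionary<string, DateTime> _lastWrite =
        new Dictionary<string, DateTime>(StringComparer.OrdinalIgnoreCase);

    // Returns files that are new or whose LastWriteTimeUtc differs from the previous scan.
    public List<string> Scan(string path)
    {
        var changed = new List<string>();
        foreach (FileInfo info in new DirectoryInfo(path).GetFiles())
        {
            DateTime previous;
            if (!_lastWrite.TryGetValue(info.FullName, out previous)
                || previous != info.LastWriteTimeUtc)
            {
                _lastWrite[info.FullName] = info.LastWriteTimeUtc;
                changed.Add(info.FullName);
            }
        }
        return changed;
    }
}
```

Calling `Scan` from each poll returns only what needs attention since the previous poll. Deleted files are not tracked in this sketch.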

p.campbell
Peter
  • Thanks Petoj, your answer gave me the headstart I needed. – Abbi Sep 08 '10 at 19:37
  • I would like to do +1 but do not have enough reputation or am I just very new to this site as I do not know how to do it. – Abbi Sep 08 '10 at 21:18
  • 1
    @md1337 if the boss just says i don't like it so don't use it, then im my eyes he is weird? (at least he could tell him why he should not use it?) – Peter May 10 '13 at 16:36
  • @Petoj, maybe he did tell him that FileSystemWatcher is not reliable, which is true. You have to use it in conjunction with polling, or use polling alone. I personally use FileSystemWatcher AND polling. This way I get the best of both worlds: instant file detection and guaranteed detection. So the boss really had a point and he looks to have experience of FileSystemWatcher. – md1337 Aug 09 '13 at 19:34
4

At program startup, use Directory.GetFiles(path) to get the list of files.

Then create a timer, and in its elapsed event call hasNewFiles:

    static List<string> hasNewFiles(string path, List<string> lastKnownFiles)
    {
        List<string> files = Directory.GetFiles(path).ToList();
        List<string> newFiles = new List<string>();

        foreach (string s in files)
        {
            if (!lastKnownFiles.Contains(s))
                newFiles.Add(s);
        }

        return newFiles;
    }

In the calling code, you'll have new files if:

    List<string> newFiles = hasNewFiles(path, lastKnownFiles);
    if (newFiles.Count > 0)
    {
        processFiles(newFiles);
        lastKnownFiles.AddRange(newFiles);
    }

edit: if you want a more linqy solution:

    static IEnumerable<string> hasNewFiles(string path, List<string> lastKnownFiles)
    {
        return from f in Directory.GetFiles(path) 
               where !lastKnownFiles.Contains(f) 
               select f;
    }

    List<string> newFiles = hasNewFiles(path, lastKnownFiles).ToList();
    if (newFiles.Count > 0)
    {
        processFiles(newFiles);
        lastKnownFiles.AddRange(newFiles);
    }
Steven Evers
1

I would question why not to use the FileSystemWatcher. It registers with the OS and is notified immediately when the event finishes in the file system.

If you really have to poll, then just create a System.Timers.Timer, create a method for it to call, and check for the file in this method.

davisoa
1

Yes, you can create a Timer, and plug a handler into the Elapsed event that will instantiate a DirectoryInfo class for the directory you're watching, and call either GetFiles() or EnumerateFiles(). GetFiles() returns a FileInfo[] array, while EnumerateFiles() returns a "streaming" IEnumerable. EnumerateFiles() will be more efficient if you expect a lot of files to be in that folder when you look; you can start working with the IEnumerable before the method has retrieved all the FileInfos, while GetFiles will make you wait.
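To illustrate the streaming behaviour, here is a minimal sketch that handles at most a fixed number of files per poll and stops enumerating as soon as that limit is reached. `StreamingScan`, `ProcessFirst`, and the `process` callback are hypothetical names; `EnumerateFiles()` requires .NET Framework 4 or later.

```csharp
using System;
using System.IO;

// Hypothetical helper showing lazy enumeration with DirectoryInfo.EnumerateFiles().
public static class StreamingScan
{
    // Processes up to 'max' files from 'path' and returns how many were handled.
    // Enumeration stops early, so the remaining entries are never materialized.
    public static int ProcessFirst(string path, int max, Action<FileInfo> process)
    {
        int handled = 0;
        foreach (FileInfo info in new DirectoryInfo(path).EnumerateFiles())
        {
            if (handled == max)
                break;
            process(info);
            handled++;
        }
        return handled;
    }
}
```

A timer's Elapsed handler could call `ProcessFirst` with a batch size that fits comfortably inside the polling interval.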

As to why this may actually be better than FileSystemWatcher, it depends on the architecture behind the scenes. Take, for example, a basic Extract/Transform/Validate/Load workflow. First, such a workflow may have to create expensive instances of objects (DB connections, instances of a rules engine, etc.). This one-time overhead is significantly mitigated if the workflow is structured to handle everything available to it in one go. Second, FileSystemWatcher would require anything called by the event handlers, like this workflow, to be thread-safe, since MANY events can be running at once if files are constantly flowing in. If that is not feasible, a Timer can be very easily configured to restrict the system to one running workflow, by having event handlers examine a thread-safe "process running" flag and simply terminate if another handler thread has set it and not yet finished. The files in the folder at that time will be picked up the next time the Timer fires, unlike FileSystemWatcher, where if you terminate the handler the information about the existence of that file is lost.
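The thread-safe "process running" flag could be sketched with `Interlocked.CompareExchange`, as below. `SingleRunGuard` and `TryRun` are hypothetical names chosen for illustration; the idea is that a timer handler which finds the flag already set simply returns and leaves the files for the next tick.

```csharp
using System;
using System.Threading;

// Hypothetical helper: lets only one workflow run at a time.
public sealed class SingleRunGuard
{
    private int _running; // 0 = idle, 1 = a workflow is currently running

    // Runs 'work' unless another call is already in progress.
    // Returns false when the call was skipped.
    public bool TryRun(Action work)
    {
        // Atomically switch 0 -> 1; any other value means someone else holds the flag.
        if (Interlocked.CompareExchange(ref _running, 1, 0) != 0)
            return false;
        try
        {
            work();
            return true;
        }
        finally
        {
            // Always release the flag, even if 'work' throws.
            Interlocked.Exchange(ref _running, 0);
        }
    }
}
```

With a `System.Timers.Timer`, the handler might be wired up as `timer.Elapsed += (s, e) => guard.TryRun(ProcessFolder);`, where `ProcessFolder` stands in for the workflow.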

KeithS
0

1) Sounds like your boss is an idiot
2) You will have to use functions like Directory.GetFiles, File.GetLastAccessTime, etc and keep it in memory to check if it changed.

Fredrik Mörk
Wildhorn
0

It is a little odd that you cannot use FileSystemWatcher or presumably any of the Win32 APIs that do the same thing, but that is irrelevant at this point. The polling method might look like this.

using System;
using System.Threading;

public class WorseFileSystemWatcher : IDisposable
{
  private ManualResetEvent m_Stop = new ManualResetEvent(false);

  public event EventHandler Change;

  public WorseFileSystemWatcher(TimeSpan pollingInterval)
  {
    var thread = new Thread(
      () =>
      {
        // WaitOne returns false on timeout, so the body runs once per polling
        // interval and the loop exits as soon as Dispose signals m_Stop.
        while (!m_Stop.WaitOne(pollingInterval))
        {
          // Add your code to check for changes here and set changeDetected accordingly.
          bool changeDetected = false;
          if (changeDetected)
          {
            if (Change != null)
            {
              Change(this, EventArgs.Empty);
            }
          }
        }
      });
    thread.Start();
  }

  public void Dispose()
  {
    m_Stop.Set();
  }
}
Brian Gideon