4

I'm new to C# and am writing a program that monitors a folder for .xml files using FileSystemWatcher, set up in a method called folderWatch. Each .xml file contains an email address and a path to an image, which is emailed once the file is read. The code works fine if I add only a few XMLs at a time, but when I try to dump a large number into the folder, FileSystemWatcher does not process all of them. Please help point me in the right direction.

private System.IO.FileSystemWatcher m_Watcher;
public string folderMonitorPath = Properties.Settings.Default.monitorFolder;

    public void folderWatch()
    {
        if(folderMonitorPath != "")
        {
            m_Watcher = new System.IO.FileSystemWatcher();
            m_Watcher.Filter = "*.xml*";
            m_Watcher.Path = folderMonitorPath;
            m_Watcher.NotifyFilter = NotifyFilters.LastAccess | NotifyFilters.LastWrite
                                     | NotifyFilters.FileName | NotifyFilters.DirectoryName;
            m_Watcher.Created += new FileSystemEventHandler(OnChanged);
            m_Watcher.EnableRaisingEvents = true;
        }
    }

    public void OnChanged(object sender, FileSystemEventArgs e)
    {
        displayText("File Added " + e.FullPath);
        xmlRead(e.FullPath);
    }

Read XML:

    public void xmlRead(string path)
    {

        XDocument document = XDocument.Load(path);
        var photo_information = from r in document.Descendants("photo_information")
                                select new
                                {
                                    user_data = r.Element("user_data").Value,
                                    photos = r.Element("photos").Element("photo").Value,
                                };
        foreach (var r in photo_information)
        {
            if (r.user_data != "")
            {
                var attachmentFilename = folderMonitorPath + @"\" + r.photos;
                displayText("new user data " + r.user_data);
                displayText("attemting to send mail");
                sendemail(r.user_data, attachmentFilename);
            }
            else
            {
                displayText("no user data moving to next file");
            }
        }
    }

Send mail:

    public void sendemail(string email, string attachmentFilename)
    {
        //myTimer.Stop();

        MailMessage mail = new MailMessage();
        SmtpClient SmtpServer = new SmtpClient(smtpClient);

        mail.From = new MailAddress(mailFrom);
        mail.To.Add(email);
        mail.Subject = "test";
        mail.Body = "text";

        SmtpServer.Port = smtpPort;
        SmtpServer.Credentials = new System.Net.NetworkCredential("username", "password");
        SmtpServer.EnableSsl = true;
        // SmtpServer.UseDefaultCredentials = true;

        if (attachmentFilename != null)
        {
            Attachment attachment = new Attachment(attachmentFilename, MediaTypeNames.Application.Octet);
            ContentDisposition disposition = attachment.ContentDisposition;
            disposition.CreationDate = File.GetCreationTime(attachmentFilename);
            disposition.ModificationDate = File.GetLastWriteTime(attachmentFilename);
            disposition.ReadDate = File.GetLastAccessTime(attachmentFilename);
            disposition.FileName = Path.GetFileName(attachmentFilename);
            disposition.Size = new FileInfo(attachmentFilename).Length;
            disposition.DispositionType = DispositionTypeNames.Attachment;
            mail.Attachments.Add(attachment);
        }

        try
        {
            SmtpServer.Send(mail);
            displayText("mail sent");
        }
        catch (Exception ex)
        {
            displayText(ex.Message);
        }
    }
user3260707
  • chances are it's missing them because of the time spent doing all that code - thread it off and have a queue of files – BugFinder Oct 10 '17 at 14:57
  • You have to use the Error event to get FSW to tell you that you are doing it wrong. – Hans Passant Oct 10 '17 at 15:26
  • FSW is very error-prone. It'll randomly stop listening -- without any error communicated -- due to some filesystem events. If interested, I have an [Observable FileSystemWatcher](http://idcomlog.codeplex.com/SourceControl/latest#IdComLog.Reactive/FileSystem.cs) that makes it much easier to use reliably. – Cory Nelson Oct 10 '17 at 16:38

3 Answers

4

First, FileSystemWatcher has a limited internal buffer for storing pending notifications. As per the documentation:

The system notifies the component of file changes, and it stores those changes in a buffer the component creates and passes to the APIs. Each event can use up to 16 bytes of memory, not including the file name. If there are many changes in a short time, the buffer can overflow. This causes the component to lose track of changes in the directory.

You can increase that buffer by setting InternalBufferSize to 64 * 1024 (64 KB, the maximum allowed value).
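
For example, a minimal sketch reusing the m_Watcher field from the question:

m_Watcher.InternalBufferSize = 64 * 1024; // raise the buffer from the default 8 KB to the 64 KB maximum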

Next (and maybe even more important) is how this buffer is cleared. Your OnChanged handler is called, and only when it has finished is the notification removed from that buffer. That means that if you do a lot of work in the handler, the buffer has a much higher chance of overflowing. To avoid this, do as little work as possible in the OnChanged handler and do all the heavy work on a separate thread, for example (not production-ready code, just for illustration purposes):

// unbounded thread-safe queue of file paths waiting to be processed
var queue = new BlockingCollection<string>(new ConcurrentQueue<string>());
// single background consumer thread that does the heavy work
new Thread(() => {
    foreach (var item in queue.GetConsumingEnumerable()) {
        // do heavy stuff with item
    }
}) {
    IsBackground = true
}.Start();
var w = new FileSystemWatcher();
// other stuff
w.Changed += (sender, args) =>
{
    // takes no time, so overflow chance is drastically reduced
    queue.Add(args.FullPath);
};

You are also not subscribed to the Error event of FileSystemWatcher so you have no idea when (and if) something goes wrong.
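
As a rough sketch (reusing the w watcher from the snippet above), subscribing costs almost nothing and at least tells you when the buffer overflows:

w.Error += (sender, args) =>
{
    // GetException() typically returns an InternalBufferOverflowException when the buffer overflows
    var ex = args.GetException();
    Console.WriteLine("FileSystemWatcher error: " + ex); // log it somewhere you will actually see it
};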

Evk
  • Thanks Evk, after a little tweaking I've got it to work; it seems no files are getting lost anymore. – user3260707 Oct 10 '17 at 17:34
  • @Evk I've a similar case, the change in my code looks like `w.Changed += (sender, args) => { // Here I call a function which takes 4 parameters, PerformAction(string, string, string, int); // So is there a way to store all the 4 parameters somehow so that I can call PerformAction() from a separate thread. queue.Add(args.FullPath); };` – m_alpha Jan 23 '20 at 09:27
  • @m_alpha you can create separate class with 4 properties (arguments to PerformAction), and store instances of this class in queue instead of just one string. – Evk Jan 23 '20 at 10:27
  • @Evk Appreciate your response, I did something similar yesterday. Created a struct and then stored its instance in the queue instead of a class. Wanted to ask if it's okay to call `new Thread(() => { foreach (var item in queue.GetConsumingEnumerable()) { // do heavy stuff with item }` in the event handler itself, i.e. `w.Changed += (sender, args)` – m_alpha Jan 24 '20 at 05:16
  • @m_alpha in this case you will create a new thread on every change, so if there are 100 changes there will be 100 threads all doing the same thing. You should do this outside of the handler. – Evk Feb 06 '20 at 09:41
1

FSW's documentation warns that if event processing takes too long, some events may be lost. That's why it's always used with a queue and/or background processing.

One option is to use Task.Run to perform the processing in the background:

public void OnChanged(object sender, FileSystemEventArgs e)
{
    _logger.Info("File Added " + e.FullPath);
    Task.Run(()=>xmlRead(e.FullPath));
}

Notice that I use logging instead of whatever displayText does. You can't touch UI controls from another thread. If you want to log progress, use a logging library.

You can also use the IProgress<T> interface to report the progress of a long-running job, or anything else you want to publish through it. The Progress<T> implementation takes care of marshalling the progress object back to its parent thread, typically the UI thread.
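
A rough sketch of that, assuming a WinForms form (Form1 and InitializeComponent are just placeholders) and that displayText updates a UI control; the important part is that the Progress<T> instance is created on the UI thread:

IProgress<string> _progress;

public Form1()
{
    InitializeComponent();
    // Created on the UI thread, so Progress<T> captures the UI SynchronizationContext
    _progress = new Progress<string>(msg => displayText(msg));
}

public void OnChanged(object sender, FileSystemEventArgs e)
{
    Task.Run(() =>
    {
        _progress.Report("processing " + e.FullPath); // callback runs back on the UI thread
        xmlRead(e.FullPath);
        _progress.Report("done " + e.FullPath);
    });
}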

An even better solution is to use ActionBlock<T>. An ActionBlock has an input buffer that queues incoming messages and a degree-of-parallelism (DOP) setting that lets you specify how many operations can run concurrently. The default is 1:

ActionBlock<string> _mailerBlock;

public void Init()
{
    var options=new ExecutionDataflowBlockOptions { 
        MaxDegreeOfParallelism = 5
     };
    _mailerBlock = new ActionBlock<string>(path => xmlRead(path), options);
}

public void OnChanged(object sender, FileSystemEventArgs e)
{
    _logger.Info("File Added " + e.FullPath);
    _mailerBlock.Post(e.FullPath);
} 

Better yet, you can create different blocks for reading and emailing and connect them in a pipeline. In this case the file reader generates multiple emails per file, which means a TransformManyBlock is needed:

class EmailInfo
{
    public string Data { get; set; }
    public string Attachment { get; set; }
}


var readerBlock = new TransformManyBlock<string,EmailInfo>(path=>infosFromXml(path));

var mailBlock = new ActionBlock<EmailInfo>(info=>sendMailFromInfo(info));

readerBlock.LinkTo(mailBlock,new DataflowLinkOptions{PropagateCompletion=true});
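
These block types live in the System.Threading.Tasks.Dataflow namespace, which ships as the System.Threading.Tasks.Dataflow NuGet package.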

The xmlRead method should be changed into an iterator:

public IEnumerable<EmailInfo> infosFromXml(string path)
{
    // Same as before ...
    foreach (var r in photo_information)
    {
        if (r.user_data != "")
        {
            ...
            yield return new EmailInfo{
                      Data=r.user_data, 
                      Attachment=attachmentFilename};
        }
       ...
    }
}

And sendemail to:

public void sendMailFromInfo(EmailInfo info)
{
    string email = info.Data;
    string attachmentFilename = info.Attachment;
    // ... then build and send the MailMessage exactly as in the original sendemail method
}

When you want to terminate the pipeline, you call Complete() on the head block and await the tail block's Completion. This ensures that all remaining files will be processed:

readerBlock.Complete();
await mailBlock.Completion;
Panagiotis Kanavos
-1

I learnt the hard way that if you need a reliable file monitor, use USN journals.

https://msdn.microsoft.com/en-us/library/windows/desktop/aa363798(v=vs.85).aspx

Here is a way you could access it from .NET if you have sufficient privileges: https://stackoverflow.com/a/31931109/612717

You could also implement it manually yourself with timer polling, using the file Length + last-modified date.
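
A very rough sketch of that polling idea (names here are illustrative; something like a System.Threading.Timer would call PollFolder every few seconds):

private readonly Dictionary<string, Tuple<long, DateTime>> _snapshot =
    new Dictionary<string, Tuple<long, DateTime>>();

private void PollFolder(string folder)
{
    foreach (var file in Directory.GetFiles(folder, "*.xml"))
    {
        var info = new FileInfo(file);
        var current = Tuple.Create(info.Length, info.LastWriteTimeUtc);

        Tuple<long, DateTime> previous;
        if (!_snapshot.TryGetValue(file, out previous) || !previous.Equals(current))
        {
            _snapshot[file] = current; // new or changed file
            // process it here (read the xml, send the mail, ...)
        }
    }
}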

Chibueze Opata
  • The modification date is itself unreliable and *slow*. FSW will work just fine if you don't *ab*use it. And .NET can't access the journal unless you use a library like AlphaFS *AND* have admin rights in order to enable it for an entire volume – Panagiotis Kanavos Oct 10 '17 at 15:34
  • And what you also don't realize is that it's lighter to use polling; it's sufficient to use the last-modified date + length to know if a file has changed in most cases. If super high accuracy is required, MD5 hashes of the first few or last few bytes of the file stream can be used. And you just need to know how to read the journal. No need for some huge library. – Chibueze Opata Oct 10 '17 at 19:45