I'm developing a small C# application that scans a log file for lines containing certain keywords and alerts the user when one of the keywords is found. The log is potentially extremely large (several gigabytes in the worst case), but the only lines that are relevant to me are the ones added while my application is running.

Is there a way I can capture each text line being appended to the file, without having to worry about the file content that was already present?

I already found the FileSystemWatcher class while searching for a solution. It seems great for notifying me when there is new content to fetch from the log, but it doesn't seem to tell me what was added.

John Saunders
Silen

4 Answers

If you keep a FileStream open in Read mode (allowing writers, of course), you should be able to initially scan through the whole file and wait at the end until the FSW notifies you that the file has been modified.

Just be careful to reset your reading thread somehow if the file is deleted, for example if the log file that you are tailing gets rolled.

Here, I knocked together an example. Run this, and while it is running, edit C:\Temp\Temp.txt in Notepad and save it:

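    using System;
    using System.IO;
    using System.Threading;

    public static class Program
    {
        public static void Main()
        {
            const string path = @"C:\Temp\Temp.txt";
            var lockMe = new object();
            // Start signaled so the first pass reads whatever is already in the file.
            using (var latch = new ManualResetEvent(true))
            using (var fs = new FileStream(path, FileMode.OpenOrCreate, FileAccess.Read, FileShare.ReadWrite))
            using (var fsw = new FileSystemWatcher(@"C:\Temp\"))
            {
                fsw.Changed += (s, e) =>
                {
                    lock (lockMe)
                    {
                        // Ignore changes to other files in the watched directory.
                        if (!string.Equals(e.FullPath, path, StringComparison.OrdinalIgnoreCase)) return;
                        latch.Set();
                    }
                };
                // The watcher raises no events until this is set.
                fsw.EnableRaisingEvents = true;
                using (var sr = new StreamReader(fs))
                    while (true)
                    {
                        latch.WaitOne();
                        lock (lockMe)
                        {
                            string line;
                            while ((line = sr.ReadLine()) != null)
                                Console.Out.WriteLine(line);
                            // Clear the signal so the loop blocks until the next
                            // change instead of spinning.
                            latch.Reset();
                        }
                    }
            }
        }
    }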
Chris Shain
  • This is working great on a small text file, but taking several minutes on a 127 MB sample due to the initial scanning. I don't suppose there's any way to avoid going through the entire file? – Silen Feb 29 '12 at 03:09
  • Just add a `fs.Seek(0, SeekOrigin.End);` in there before you create the StreamReader. – Chris Shain Feb 29 '12 at 03:13
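
The seek suggested in the comment above could look roughly like the sketch below. The `TailSketch` class and `OpenAtEnd` method are hypothetical names used only for illustration, and the reader is assumed to fall back to UTF-8 because no byte order mark is seen when reading starts at the end of the file.

```csharp
using System;
using System.IO;

public static class TailSketch
{
    // Hypothetical helper: opens the log for shared reading and positions the
    // stream at its current end, so the reader only returns text appended later.
    public static StreamReader OpenAtEnd(string path)
    {
        var fs = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.ReadWrite);
        fs.Seek(0, SeekOrigin.End);   // skip everything already in the file
        return new StreamReader(fs);  // disposing the reader also closes fs
    }
}
```

Calling `ReadLine()` on the returned reader yields `null` until something is appended, so the existing content is never scanned. If the log uses another encoding, pass it to the `StreamReader` constructor explicitly.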

The most efficient solution (if your application needs it) is to write a file hook driver that captures all writes to the file. That driver might tell you which bytes were changed. If you don't want to write the driver in C/C++, perhaps you can use EasyHook. EasyHook is great because, if you know the exact application that's writing to the log file, you can write a very simple user-mode hook (check the examples on CodePlex). If you don't know the name of the application, you might have to write a kernel hook (which is still easier with EasyHook).

Jason

You can do this in a similar way to this question, but you'll need to have the old file size recorded. Then, instead of seeking back 10 newlines, just seek back by the size difference. You'll have to be careful about encodings, though.
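
A minimal sketch of the size-difference idea, under some assumptions: the log is UTF-8, the appended chunk is small enough to hold in memory, and a file that shrinks has been truncated or rolled. `AppendedReader` and `ReadAppended` are hypothetical names, not part of any library.

```csharp
using System;
using System.IO;
using System.Text;

public static class AppendedReader
{
    // Returns the text appended since lastLength and advances lastLength
    // by the number of bytes actually read.
    public static string ReadAppended(string path, ref long lastLength)
    {
        using (var fs = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
        {
            long current = fs.Length;
            if (current < lastLength) lastLength = 0;        // assumed truncated or rolled: start over
            if (current == lastLength) return string.Empty;  // nothing new

            var buffer = new byte[current - lastLength];
            fs.Seek(lastLength, SeekOrigin.Begin);

            // Read may return fewer bytes than requested, so loop until done.
            int total = 0;
            while (total < buffer.Length)
            {
                int n = fs.Read(buffer, total, buffer.Length - total);
                if (n == 0) break;
                total += n;
            }

            lastLength += total;
            // Assumes UTF-8; a multi-byte character split across two calls
            // would be decoded incorrectly here.
            return Encoding.UTF8.GetString(buffer, 0, total);
        }
    }
}
```

Record `new FileInfo(path).Length` once at startup, then call `ReadAppended` each time the file changes.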

Community
Matthew Finlay

Instead of reading the text from the file (which I assume is what you are doing), read the bytes of the file. If you can assume that writes to the file are always appended, and you know the text encoding of the file, then you can just read in the bytes starting at the original file size. Then convert the bytes to text using the proper encoding.
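
One way the byte-to-text step might be handled is sketched below. It assumes the newly appended bytes have already been read (for example, starting at the original file size) and that the caller knows the encoding. `LineAssembler` and `Feed` are hypothetical names. A stateful `Decoder` is used so that a multi-byte character split across two reads is still decoded correctly.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public sealed class LineAssembler
{
    private readonly Decoder _decoder;                       // keeps partial characters between calls
    private readonly StringBuilder _pending = new StringBuilder();

    public LineAssembler(Encoding encoding)
    {
        _decoder = encoding.GetDecoder();
    }

    // Feed newly read bytes; returns every line completed by this chunk.
    // Text after the last newline is held back until a later call.
    public List<string> Feed(byte[] bytes, int count)
    {
        var chars = new char[_decoder.GetCharCount(bytes, 0, count)];
        int charCount = _decoder.GetChars(bytes, 0, count, chars, 0);
        _pending.Append(chars, 0, charCount);

        var lines = new List<string>();
        string text = _pending.ToString();
        int start = 0, nl;
        while ((nl = text.IndexOf('\n', start)) >= 0)
        {
            lines.Add(text.Substring(start, nl - start).TrimEnd('\r'));
            start = nl + 1;
        }

        // Keep only the unfinished tail for the next call.
        _pending.Clear();
        _pending.Append(text, start, text.Length - start);
        return lines;
    }
}
```

Each returned line can then be checked against the keywords. Holding back the unfinished tail avoids matching against a line the writer has only partly flushed.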

BryanJ
  • Hmm, the log is indeed always appended, so this seems like a solid solution to avoid running through the entire file. Going to give it a try. – Silen Feb 29 '12 at 03:17