8

What I'm trying to accomplish is a C# application that will read logs from the Windows Event Logs and store them somewhere else. This has to be fast, since some of the devices where it will be installed generate a high amount of logs/s.

I have tried three approaches so far:

Local WMI: it didn't work good, there are too many errors and exceptions caused by the size of the collections that need to be loaded. EventLogReader: I though this was the perfect solution, since it allows you to query the event log however you like by using XPath expressions. The problem is that when you want to get the content of the message for each log (by calling FormatDescription()) takes way too much time for long collections. E.g: I can read 12k logs in 0.11s if I just go over them. If I add a line to store the message for each log, it takes nearly 6 minutes to complete exactly the same operation, which is totally crazy for such a low number of logs. I don't know if there's any kind of optimization that might be done to EventLogReader in order to get the message faster, I couldn't find anything either on MS documentation nor on the Internet.

I also found that you can read the log entries by using a class called EventLog. However, this technology does not allow you to enter any kind of filters so you basically have to load the entire list of logs to memory and then filter it out according to your needs. Here's an example:

EventLog eventLog = EventLog.GetEventLogs().FirstOrDefault(el => el.Log.Equals("Security", StringComparison.OrdinalIgnoreCase));
var newEntries = (from entry in eventLog.Entries.OfType()
orderby entry.TimeWritten ascending
where entry.TimeWritten > takefrom
select entry);

Despite of being faster in terms of getting the message, the use of memory might be high and I don't want to cause any issues on the devices where this solution will get deployed.

Can anybody help me with this? I cannot find any workarounds or approaches to achieve something like this.

Thank you!.

  • `the use of memory might be high and I don't want to cause any issues on the devices where this solution will get deployed.` Meaning the user will use the application while its moving logs? I'd just use a Task to handle the work.. – Jevgeni Geurtsen Nov 22 '18 at 12:20
  • No, it just means that I don't want to get a SystemOutOfMemoryException or something like that on the service because of the high memory usage. I just want to know if there's some more efficient workaround than the ones I could find so far. – Lucas Álvarez Lacasa Nov 22 '18 at 12:29
  • This will basically run in a Windows Service, what I don't know about the third approach (EventLog) is if there's a way to filter the log collection without having to read everything in memory and then applying a LINQ query. – Lucas Álvarez Lacasa Nov 22 '18 at 12:31
  • 1
    I think you want to use the constructor of `EventLog`: `new EventLog("Security")`. Also, if it runs as a service, I would copy the existing Logs during the first run (on background), then hook the `EntryWrittenEventHandler` within `EventLog` in your service to handle newly created logs. – Jevgeni Geurtsen Nov 22 '18 at 12:40
  • So, you are basically telling me to subscribe to the EntryWrittenEventHandler for each one of the logfiles from where I need to read? And then every time this event fires I will have the newly created log and I can forward it to whatever I want to. Is that correct? – Lucas Álvarez Lacasa Nov 22 '18 at 12:58
  • Yes. To ellaborate; if you hook the `EntryWrittenEventHandler ` of `new EventLog("Security")` all newly written security events will be passed your handler. This doesn't change the fact that on the first run you'll need to iterate through all existing items though. – Jevgeni Geurtsen Nov 22 '18 at 13:36
  • Can you provide me a snippet to know how to do it? Why you say I need to iterate through existent elements first? Can't I just start taking the new ones from the moment I register? – Lucas Álvarez Lacasa Nov 22 '18 at 15:45
  • You dont have to, only if you want to access the already existing list. Could you edit your main post and show us what you have tried with the event handler? – Jevgeni Geurtsen Nov 22 '18 at 17:29
  • I'm gonna create a snippet and should you what I have accomplished – Lucas Álvarez Lacasa Nov 23 '18 at 12:13

2 Answers2

6

You can give the EventLogReader class a try. See https://learn.microsoft.com/en-us/previous-versions/bb671200(v=vs.90).

It is better than the EventLog class because accessing the EventLog.Entries collection has the nasty property that its count can change while you are reading from it. What is even worse is that the reading happens on an IO threadpool thread which will let your application crash with an unhandled exception. At least that was the case some years ago.

The EventLogReader also gives you the ability to supply a query string to filter for the events you are interested in. That is the way to go if you write a new application.

Here is an application which shows how you can parallelize reading:

using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Diagnostics;
using System.Diagnostics.Eventing.Reader;
using System.Linq;
using System.Threading.Tasks;

namespace EventLogReading
{
    class Program
    {
        static volatile bool myHasStoppedReading = false;

        static void ParseEventsParallel()
        {
            var sw = Stopwatch.StartNew();
            var query = new EventLogQuery("Application", PathType.LogName, "*");

            const int BatchSize = 100;

            ConcurrentQueue<EventRecord> events = new ConcurrentQueue<EventRecord>();
            var readerTask = Task.Factory.StartNew(() =>
            {
                using (EventLogReader reader = new EventLogReader(query))
                {
                    EventRecord ev;
                    bool bFirst = true;
                    int count = 0;
                    while ((ev = reader.ReadEvent()) != null)
                    {
                        if ( count % BatchSize == 0)
                        {
                            events.Enqueue(ev);
                        }
                        count++;
                    }
                }
                myHasStoppedReading = true;
            });

            ConcurrentQueue<KeyValuePair<string, EventRecord>> eventsWithStrings = new ConcurrentQueue<KeyValuePair<string, EventRecord>>();

            Action conversion = () =>
            {
                EventRecord ev = null;
                using (var reader = new EventLogReader(query))
                {
                    while (!myHasStoppedReading || events.TryDequeue(out ev))
                    {
                        if (ev != null)
                        {
                            reader.Seek(ev.Bookmark);
                            for (int i = 0; i < BatchSize; i++)
                            {
                                ev = reader.ReadEvent();
                                if (ev == null)
                                {
                                    break;
                                }
                                eventsWithStrings.Enqueue(new KeyValuePair<string, EventRecord>(ev.FormatDescription(), ev));
                            }
                        }
                    }
                }
            };

            Parallel.Invoke(Enumerable.Repeat(conversion, 8).ToArray());

            sw.Stop();
            Console.WriteLine($"Got {eventsWithStrings.Count} events with strings in {sw.Elapsed.TotalMilliseconds:N3}ms");
        }

        static void ParseEvents()
        {
            var sw = Stopwatch.StartNew();
            List<KeyValuePair<string, EventRecord>> parsedEvents = new List<KeyValuePair<string, EventRecord>>();
                
            using (EventLogReader reader = new EventLogReader(new EventLogQuery("Application", PathType.LogName, "*")))
            {
                EventRecord ev;
                while ((ev = reader.ReadEvent()) != null)
                {
                    parsedEvents.Add(new KeyValuePair<string, EventRecord>(ev.FormatDescription(), ev));
                }
            }

            sw.Stop();
            Console.WriteLine($"Got {parsedEvents.Count} events with strings in {sw.Elapsed.TotalMilliseconds:N3}ms");
        }

        static void Main(string[] args)
        {
            ParseEvents();
            ParseEventsParallel();
        }
    }
}
Got 20322 events with strings in 19,320.047ms
Got 20323 events with strings in 5,327.064ms

This gives a decent speedup of a factor 4. I needed to use some tricks to get faster because for some strange reason the class ProviderMetadataCachedInformation is not thread safe and uses internally a lock(this) around the Format method which defeats paralell reading. The key trick is to open the event log in the conversion threads again and then read a bunch of events of the query there via the event bookmark Api. That way you can format the strings independently.

Update1

I have landed a change in .NET 5 which increases performance by a factor three up to 20. See https://github.com/dotnet/runtime/issues/34568. You can also copy the EventLogReader class from .NET Core and use this one instead which will give you the same speedup.

The full saga is described by my Blog Post: https://aloiskraus.wordpress.com/2020/07/20/ms-performance-hud-analyze-eventlog-reading-performance-in-realtime/

Alois Kraus
  • 13,229
  • 1
  • 38
  • 64
  • You'll loop through the `EventLog.Entries` collection with a `for-loop`, so if the count would change while iterating, it won't matter. Also this problem is solved whenever OP uses the event to intercept the new event entries. I agree that for querying the `EventLogReader ` is better (not performance-wise) because it allows querying out of the box. I even think that because of the `for-loop`, `EventLog.Entries` is even faster than the `EventLogReader`. – Jevgeni Geurtsen Nov 23 '18 at 08:39
  • Thanks for the response!. Like I said before, the problem with EventLogReader is that you hit a huge performance slowdown when accessing the message of each log, through the "FormatDescription()" method. I know it's faster since you can query and retrieve in memory only the slides of logs that you might want, but on the other hand it takes too much time to process each log, so it's not an acceptable solution for what I have to accomplish. – Lucas Álvarez Lacasa Nov 23 '18 at 11:34
  • 1
    You can always format things with multiple threads that should give you a decent speedup if you can use them. – Alois Kraus Nov 23 '18 at 15:49
  • I don't know how you would do that, you can have a thread per logfile that you try to analyze, but that's the furthest I would go I think. What could increase the performance significantly is try to store the templates of the messages that FormatDescription looks for, but this seems to be way too complex to achieve. – Lucas Álvarez Lacasa Nov 26 '18 at 12:56
  • @LucasÁlvarezLacasa: I have updated my sample with parallel reading. – Alois Kraus Nov 26 '18 at 22:55
  • Alois, thanks for your reply. That approach seems better. However, you will still face the downtime of getting the log message through FormatDescription(). I'm working now with EntryWritten, basically taking each new event, validating that the datetime of that event is newer or the same than the last one that I got and pushing it to a ConcurrentQueue to be delivered later on. This seems to be the fastest approach so far. I will publish a POC as soon as I finish it. – Lucas Álvarez Lacasa Nov 27 '18 at 14:09
3

We discussed a bit about reading the existing logs in the comments, can access the Security-tagged logs by accessing:

 var eventLog = new EventLog("Security");
 for (int i = 0; i < eventLog.Entries.Count; i++)
 {
      Console.WriteLine($"{eventLog.Entries[i].Message}");
 }

This might not be the cleanest (performance-wise) way of doing it, but I doubt any other will be faster, as you yourself have already found out by trying out different techniques. A small edit duo to Alois post: EventLogReader is not faster out of the box than EventLog, especially when using the for-loop mechanism showed in the code block above, I think EventLog is faster -- it only accesses the entries inside the loop using their index, the Entries collection is just a reference, whereas while using the EventLogReader, it will perform a query first and loop through that result, which should be slower. As commented on Alois's post: if you don't need to use the query option, just use the EventLog variant. If you do need querying, use the EventLogReader as is can query on a lower level than you could while using EventLog (only LINQ queries, which is slower ofcourse than querying in while executing the look-up).

To prevent you from having this hassle again in the future, and because you said you are running a service, I'd use the EntryWritten event of the EventLog class:

    var eventLog = new EventLog("Security")
    {
        EnableRaisingEvents = true
    };
    eventLog.EntryWritten += EventLog_EntryWritten;

    // .. read existing logs or do other work ..

    private static void EventLog_EntryWritten(object sender, EntryWrittenEventArgs e)
    {
        Console.WriteLine($"received new entry: {e.Entry.Message}");
    }

Note that you must set the EnableRaisingEvents to true in order for the event to fire whenever a new entry is logged. It'll also be a good practice (also, performance-wise) to start a (for example) Task, so that the system won't lock itself while queuing up the calls to your event.

This approach works fine if you want to retrieve all newly created events, if you want to retrieve newly created events but use a query (filter) for these events, you can check out the EventLogWatcher class, but in your case, when there are no constraints, I'd just use the EntryWritten event because you don't need filters and for plain old simplicity.

Jevgeni Geurtsen
  • 3,133
  • 4
  • 17
  • 35
  • What if the service where this tasks are running stops? Is there a way to re enable this callback from a certain point? Or I will have to start again from that point and potentially lose logs in the middle?. – Lucas Álvarez Lacasa Nov 22 '18 at 18:14
  • What I'm trying to say is, suppose that I'm reading logs with the callback. Suddenly my service stops. If I start over again and register the callback, I will lose the logs that were generated in the middle. Am I right? – Lucas Álvarez Lacasa Nov 22 '18 at 18:15
  • Yes, those logs won't be logged through the event handler, although you could easily write some code that retrieves the logs in between the last log you did handle and the most recent one (logs are saved sequentially, indexed). – Jevgeni Geurtsen Nov 22 '18 at 18:24
  • Yeah, it looks like a good approach. I'm gonna give it a try. Thank you man!. I will post any news I might get. – Lucas Álvarez Lacasa Nov 22 '18 at 19:06
  • Jevgeni Geurtsen, I created a small POC with EventLogWatcher, but you end up getting an EventRecord, and thus, you have the slowdown of getting the message through "FormatDescription()". Looks like the most performant way to resolve this problem is using EventLog with OnEntryWritten callback. – Lucas Álvarez Lacasa Nov 23 '18 at 12:49
  • Jevgeni Geurtsen as far as I've been able to test, these events should be queued somewhere, since if I place a Thread.Sleep(20000) when processing one log, the next one will be processed after the thread wakes up, meaning that threre's only one thread to handle all the logs. What happens if you are not able to process and send each log as fast as you should? Where will they be queued by this class?. Thank you!. – Lucas Álvarez Lacasa Nov 23 '18 at 19:45
  • @LucasÁlvarezLacasa Thread.Sleep is not the way to go, I'd fire a new task for each incoming event or create some manager class, that runs on another thread, and handle things in there. – Jevgeni Geurtsen Nov 25 '18 at 14:27
  • Jevgeni Geurtsen can you please provide me a snippet with the basic architecture? I'm not sure how to do that. – Lucas Álvarez Lacasa Nov 26 '18 at 12:55
  • Just to clarify a little bit, I'm firing a new task per logfile to where I want to be listening to. In there, I have to do something like a "while(true)" and wait for new events to come. Is that correct? Despite of doing that, I don't know what happens with events that are not being processed instantaneously (that's why I said I'm using a Thread.Sleep to see what the next event is after the thread is awaken) , where are they going? Who queues them and where?. – Lucas Álvarez Lacasa Nov 26 '18 at 13:15
  • @LucasÁlvarezLacasa You can indeed start a new task per log entry being added. In this task you can read the event data and send it to another server. This is the sole responsibility of the task; reading and sending the data. Retrieving new entries is done in the event handler; we are just moving the 'long work operation' to a task, so your event handler won't get locked up. You might be interested in reading; https://stackoverflow.com/questions/30225476/task-run-with-parameters – Jevgeni Geurtsen Nov 26 '18 at 13:23
  • Jevgeni Geurtsen let's say that the process of sending these logs might fail because the server to where I have to place them is fallen. What might happen with the triggered events if I decide to use only one task for all of them and not a task per fired event? My doubt is, where are those events being queued by the Framework? Like I mentioned you, I just tested that by doing a Thread.Sleep(20000) and the next event that I got after that was the one supposed to be received. This means that the triggered events are being queued somewhere. Do you know about this detail?. Thanks!. – Lucas Álvarez Lacasa Nov 26 '18 at 19:03
  • Guys, I decided to use this approach. Does anyone of you know why I might be getting unset parameters in some of the messages when using this eventLog.EntryWritten? I.E, I get %%14608 on my messages. In the EventViewer the message is completed, but that's not the case when I retrieve it with this technology. – Lucas Álvarez Lacasa Dec 21 '18 at 14:18