I've written a log parser that handles log files containing a large number of rows. The current implementation builds the list of entries in a LinkedList<T> and is quite fast.

I've also built a log viewer that uses a virtual list view. As you might expect, it does not work very well with a linked list, since a virtual list view requests items by index.

I'm thinking about using the LinkedList<T> during the initial parsing and then allocating a List<T> (with the capacity specified) when parsing is done. I would then simply use AddRange to copy the log entries into the list.

New items will be added later on, since I use a FileSystemWatcher on the log file, but not at the same rate as during the initial parsing.

Is it a good idea to switch, or do you have any better suggestions?

Update

The parsing is done by something that implements the following interface (one parser per log format).

Public Interface ILogParser
    Sub Parse(ByVal stream As IO.Stream)
    Sub Parse(ByVal stream As IO.Stream, ByVal offset As Long)
    Event EntryParsed(ByVal sender As Object, ByVal e As ParsedEntryEventArgs)
    Event Completed(ByVal sender As Object, ByVal e As EventArgs)
End Interface

The log viewer subscribes to both events and adds each entry from the EntryParsed event to the LinkedList<T>. The Completed event is raised when the whole log (file) stream has been parsed.

Once the initial parse has completed, I keep track of the last position in the stream that was successfully parsed, and Parse(fileStream, lastPosition) is called each time a FileSystemWatcher event is raised.
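The "remember the last parsed position, then parse from there" step could look roughly like the sketch below. It is not the asker's parser: ReadNewLines is a hypothetical helper, entries are plain strings rather than parsed log entries, and UTF-8 text with LF or CRLF line endings is assumed. It only advances the position past the last complete line, so a half-written line is picked up on the next FileSystemWatcher event.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports System.Text

Module IncrementalReadSketch
    ' Hypothetical helper: appends the complete lines written after lastPosition
    ' to entries and returns the position to resume from on the next call.
    Function ReadNewLines(ByVal path As String, ByVal lastPosition As Long, ByVal entries As List(Of String)) As Long
        ' FileShare.ReadWrite lets the logging application keep writing to the file.
        Using fs As New FileStream(path, FileMode.Open, FileAccess.Read, FileShare.ReadWrite)
            If fs.Length <= lastPosition Then Return lastPosition
            fs.Seek(lastPosition, SeekOrigin.Begin)

            ' Read everything that was appended since the last call.
            Dim buffer(CInt(fs.Length - lastPosition) - 1) As Byte
            Dim read As Integer = 0
            While read < buffer.Length
                Dim n As Integer = fs.Read(buffer, read, buffer.Length - read)
                If n = 0 Then Exit While
                read += n
            End While
            If read = 0 Then Return lastPosition

            ' Consume only up to the last newline (byte 10); the rest is retried later.
            Dim lastNewline As Integer = Array.LastIndexOf(buffer, CByte(10), read - 1)
            If lastNewline < 0 Then Return lastPosition

            ' Assumption: the log is UTF-8 encoded.
            Dim text As String = Encoding.UTF8.GetString(buffer, 0, lastNewline + 1)
            For Each line As String In text.Split({ControlChars.Lf}, StringSplitOptions.RemoveEmptyEntries)
                entries.Add(line.TrimEnd(ControlChars.Cr))
            Next
            Return lastPosition + lastNewline + 1
        End Using
    End Function
End Module
```

A FileSystemWatcher Changed handler would call `lastPosition = ReadNewLines(path, lastPosition, entries)`. The sketch does not handle a log file that is truncated or rotated; in that case the file length drops below lastPosition and nothing is read.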

jgauffin
  • Why do you use `LinkedList` in the first place? Is it faster than `List`? Do you insert/delete items at the beginning that often? – CodesInChaos Mar 23 '11 at 11:22
  • A `LinkedList` does not allocate an internal array as `List` does. Hence, there is no need for the list to allocate a larger buffer when the current one gets full. See: http://stackoverflow.com/questions/169973/when-should-i-use-a-list-vs-a-linkedlist – jgauffin Mar 23 '11 at 11:26
  • Describe the "parsing" better. It is hard for anybody to see how that could justify the choice of a LinkedList. – Hans Passant Mar 23 '11 at 12:20

3 Answers

I have written a lot of parsers, and I use List<> instead of LinkedList<>. What I normally do is make a guess at the size of the List before I start parsing (you can often estimate how big your List needs to be from the size of the file or the line count).

That being said, if using List<> makes your system throw a lot of OutOfMemoryExceptions, I would swap to LinkedList<>.

Side note: when I say lots of lines, I mean 200k+.
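As a rough illustration of guessing the capacity from the file size, here is a minimal sketch. The names (CapacitySketch, EstimateLineCount, LoadLines) and the 120 bytes-per-line figure are assumptions made for the example, not measurements; entries are plain strings rather than parsed log entries.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO

Module CapacitySketch
    ' Assumed average line length in bytes; measure your own logs to pick a value.
    Const AssumedBytesPerLine As Integer = 120

    ' Estimates the number of lines from the file length, rounding up by one.
    Function EstimateLineCount(ByVal fileLength As Long) As Integer
        Dim estimate As Long = fileLength \ AssumedBytesPerLine + 1
        Return CInt(Math.Min(estimate, Integer.MaxValue))
    End Function

    ' Pre-sizes the list so that Add rarely has to grow the internal array.
    Function LoadLines(ByVal path As String) As List(Of String)
        Dim entries As New List(Of String)(EstimateLineCount(New FileInfo(path).Length))
        For Each line As String In File.ReadLines(path)
            entries.Add(line)
        Next
        Return entries
    End Function
End Module
```

If the guess is too low, the list still works; it simply falls back to its normal growth behaviour for the remaining entries.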

EKS

LinkedList does not provide any benefit here, since you do not have any requirement to "remove" or "insert" log entries. If the logs are timestamp-based, new entries would only ever be "appended".

Shamit Verma

I've switched from List<T> to LinkedList<T> for the initial parse of the log file. When that first parse is done, I copy everything to a List<T> so that I can use it in my virtual list view.

The solution works great and the performance gain is a plus ;)
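For reference, the copy step might look something like this minimal sketch. ToIndexableList is a hypothetical name, and the element type is left generic because the entry class is not shown here.

```vbnet
Imports System.Collections.Generic

Module CopySketch
    ' Copies the entries collected during the initial parse into a List(Of T),
    ' which supports the index-based access a virtual list view needs.
    Function ToIndexableList(Of T)(ByVal parsed As LinkedList(Of T)) As List(Of T)
        ' Count is already known, so the list is allocated once at the right size.
        Dim result As New List(Of T)(parsed.Count)
        result.AddRange(parsed)
        Return result
    End Function
End Module
```

`New List(Of T)(parsed)` should behave the same way, since LinkedList(Of T) implements ICollection(Of T) and the constructor can size the list from its Count. Entries that arrive later from the FileSystemWatcher can then be appended with Add.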

jgauffin