I need to read the first and last lines of a log file in a .NET 4.5 application.
The log file has timestamps on every line and I want to find the youngest (first line) and oldest (last line) timestamps. This isn't a difficult task but I'm wondering if there's a clever way of doing it.

Currently the implementation looks like this (I actually need the second line of the log file because the first line is blank, hence the Skip()):

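// Skip(1) alone returns an IEnumerable<string>; First() picks the second line of the file.
string firstLine = File.ReadLines(logFile).Skip(1).First();
string lastLine = File.ReadLines(logFile).Last();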

Can there be any improvements to this very simple code?

valsidalv
  • Define "improvements". The answer by Habib will work great for small files. For very large files it's going to be rather slow, though. But speeding it up is somewhat involved. You end up reading the end of the file in binary and working backwards to the beginning of the last line. See http://stackoverflow.com/a/452945/56778 if you're really interested. – Jim Mischel Oct 29 '13 at 19:42
  • Skip(1) means it returns the rest of the elements after the first one. I guess you meant First()? – Measurity Oct 29 '13 at 19:55
  • @JimMischel Improvements in execution time, with memory a close second. Since it's always best to optimize for the most common case, I can say that the majority of files will be about 5MB or less and in extreme cases something around 25MB (~200k lines). I can also have between 20-30 of these files to parse. I looked at backwards reading earlier and agree that it looks 'rather tricky', so I will avoid that if possible. The current solution isn't 'slow' either way. – valsidalv Oct 29 '13 at 19:56
  • If the majority of files are 5MB, then I'd probably use Habib's method. The few large files will take maybe 1/2 second each to load (if that). But if what you have is fast enough, then I'd go with that. – Jim Mischel Oct 29 '13 at 20:17

2 Answers

Call File.ReadLines once, keep the resulting IEnumerable<string>, and then use that for both the second and the last line.

var lines = File.ReadLines(logFile);
// Skip(1) alone yields an IEnumerable<string>; First() takes the second line of the file.
string firstLine = lines.Skip(1).First();
string lastLine = lines.Last();

In your current code you call File.ReadLines twice. If you expect the file to be modified between the first read and the second, then you do have to read the file twice.
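One thing worth keeping in mind is that File.ReadLines is lazily evaluated, so each LINQ call that enumerates `lines` opens and walks the file on its own. If a true single pass is wanted, a plain loop is one way to get it. The following is only a minimal sketch: the `LogEnds` class and the `SecondAndLast` method are hypothetical names made up for illustration, and it assumes (as in the question) that the first line is blank and the second line carries the first timestamp.

```csharp
using System;
using System.IO;

static class LogEnds
{
    // Hypothetical helper (not part of the .NET framework): returns the second
    // and the last line of the file, enumerating the file exactly once.
    public static Tuple<string, string> SecondAndLast(string path)
    {
        string second = null; // stays null if the file has fewer than two lines
        string last = null;   // stays null if the file is empty
        int index = 0;

        foreach (string line in File.ReadLines(path))
        {
            if (index == 1)
            {
                second = line; // the first line is blank, so keep the second one
            }
            last = line;       // overwritten on every iteration until the final line
            index++;
        }

        return Tuple.Create(second, last);
    }
}
```

Usage would look like `var ends = LogEnds.SecondAndLast(logFile);`, with `ends.Item1` holding the second line and `ends.Item2` the last one. This still reads the whole file, so it only saves the second open, not the scan itself.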

Habib
  • `lines` isn't an array, it is an `IEnumerable`. That's better than reading into an array, it just makes your documentation a little off. – Joel Rondeau Oct 29 '13 at 19:46
  • So what was happening when I would call `ReadLines` twice? Would the entire file be read twice? – valsidalv Oct 29 '13 at 20:11
  • @valsidav: No, the entire file isn't read twice. However, the system has to go through opening the file twice, which isn't exactly cheap. – Jim Mischel Oct 29 '13 at 20:18

Well, the most obvious solution, File.ReadLines, has already been found.

An alternative, in case of big files, could be:

ReverseLineReader class from here: https://stackoverflow.com/a/452945/2254877
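To give a rough idea of what reading from the end involves, here is a much simpler sketch than the linked class (it is not that class, and `TailReader` / `ReadLastLine` are hypothetical names). It assumes the file is UTF-8 or ASCII, where the byte 0x0A only ever means a line feed, and it reads one byte at a time for clarity rather than speed.

```csharp
using System;
using System.IO;
using System.Text;

static class TailReader
{
    // Hypothetical helper: returns the last non-empty-terminated line of the file
    // by scanning backwards from the end instead of reading the whole file.
    // Assumption: UTF-8 or ASCII content, so '\n' never occurs inside a character.
    public static string ReadLastLine(string path)
    {
        using (var fs = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
        {
            long end = fs.Length;

            // Step back over any trailing "\r" / "\n" bytes at the end of the file.
            while (end > 0)
            {
                fs.Position = end - 1;
                int b = fs.ReadByte();
                if (b != '\n' && b != '\r') break;
                end--;
            }

            // Walk backwards until the line feed that precedes the last line
            // (or the start of the file) is found.
            long start = end;
            while (start > 0)
            {
                fs.Position = start - 1;
                if (fs.ReadByte() == '\n') break;
                start--;
            }

            // Read the bytes between start and end and decode them.
            var buffer = new byte[end - start];
            fs.Position = start;
            int read = 0;
            while (read < buffer.Length)
            {
                int n = fs.Read(buffer, read, buffer.Length - read);
                if (n == 0) break;
                read += n;
            }
            return Encoding.UTF8.GetString(buffer, 0, read);
        }
    }
}
```

A production version would read backwards in larger chunks and handle other encodings, which is where the extra complexity of the linked class comes from.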

Vladimir Gondarev