18

I need a snippet of code that reads the last n lines of a log file. I came up with the following code from the net. I am fairly new to C#. Since the log file might be quite large, I want to avoid the overhead of reading the entire file. Can someone suggest any performance enhancements? I do not really want to read each character and change position.

    var reader = new StreamReader(filePath, Encoding.ASCII);
    reader.BaseStream.Seek(0, SeekOrigin.End);
    var count = 0;
    while (count <= tailCount)
    {
        if (reader.BaseStream.Position <= 0) break;
        reader.BaseStream.Position--;
        int c = reader.Read();
        if (reader.BaseStream.Position <= 0) break;
        reader.BaseStream.Position--;
        if (c == '\n')
        {
            ++count;
        }
    }

    var str = reader.ReadToEnd();
skaffman
frictionlesspulley
  • You can't use a StreamReader like that. – SLaks Jan 06 '11 at 20:56
  • take a look at http://stackoverflow.com/questions/1271225/c-reading-a-file-line-by-line. You could then use LINQ extension `.Last()` on the IEnumerable to get the last N lines – Russ Cam Jan 06 '11 at 20:57
  • @Russ: No, you can't. LINQ cannot efficiently give you the last _n_ lines. – SLaks Jan 06 '11 at 20:58
  • @Slaks - oops! I thought there was an overload to get last N items... been a long day! Now that I think about it, it would require backtracking once at the end to get N items. – Russ Cam Jan 06 '11 at 21:01
  • 1
    http://stackoverflow.com/questions/398378/get-last-10-lines-of-very-large-text-file-10gb-c – CodesInChaos Jan 06 '11 at 21:06
  • See also: [Get last 10 lines of very large text file > 10GB c#](http://stackoverflow.com/questions/398378), [How to read a text file reversely with iterator in C#](http://stackoverflow.com/questions/452902), [Read from a file starting at the end, similar to tail](http://stackoverflow.com/questions/4368857) – hippietrail Nov 05 '12 at 18:28

9 Answers

10

Your code will perform very poorly, since you aren't allowing any caching to happen.
In addition, it will not work at all for Unicode.

I wrote the following implementation:

///<summary>Returns the end of a text reader.</summary>
///<param name="reader">The reader to read from.</param>
///<param name="lineCount">The number of lines to return.</param>
///<returns>The last lineCount lines from the reader.</returns>
public static string[] Tail(this TextReader reader, int lineCount) {
    var buffer = new List<string>(lineCount);
    string line;
    for (int i = 0; i < lineCount; i++) {
        line = reader.ReadLine();
        if (line == null) return buffer.ToArray();
        buffer.Add(line);
    }

    int lastLine = lineCount - 1;           //The index of the last line read into the buffer.  Everything > this index was read earlier than everything <= this index

    while (null != (line = reader.ReadLine())) {
        lastLine++;
        if (lastLine == lineCount) lastLine = 0;
        buffer[lastLine] = line;
    }

    if (lastLine == lineCount - 1) return buffer.ToArray();
    var retVal = new string[lineCount];
    buffer.CopyTo(lastLine + 1, retVal, 0, lineCount - lastLine - 1);
    buffer.CopyTo(0, retVal, lineCount - lastLine - 1, lastLine + 1);
    return retVal;
}
SLaks
  • 2
    Really liked the idea of the shifting buffer. But won't this effectively read the entire log file? Is there an effective way to "seek" to the start of the nth line and do a ReadLine() from there? This might be a dumb doubt of mine! – frictionlesspulley Jan 06 '11 at 21:35
  • 2
    @frictionlesspulley: Try http://stackoverflow.com/questions/398378/get-last-10-lines-of-very-large-text-file-10gb-c/398512#398512 – SLaks Jan 06 '11 at 22:06
4

I had trouble with your code. This is my version. Since it's a log file, something might be writing to it, so it's best to make sure you're not locking it.

You go to the end. Start reading backwards until you reach n lines. Then read everything from there on.

        int n = 5; //or any arbitrary number
        int count = 0;
        string content;
        byte[] buffer = new byte[1];

        using (FileStream fs = new FileStream("text.txt", FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
        {
            // read to the end.
            fs.Seek(0, SeekOrigin.End);

            // read backwards 'n' lines
            while (count < n)
            {
                fs.Seek(-1, SeekOrigin.Current);
                fs.Read(buffer, 0, 1);
                if (buffer[0] == '\n')
                {
                    count++;
                }

                fs.Seek(-1, SeekOrigin.Current); // fs.Read(...) advances the position, so we need to go back again
            }
            fs.Seek(1, SeekOrigin.Current); // go past the last '\n'

            // read the last n lines
            using (StreamReader sr = new StreamReader(fs))
            {
                content = sr.ReadToEnd();
            }
        }
Maverick Meerkat
  • 1
    I like this solution because it avoids reading the entire file, but a check that fs.Position > 0 should be included to avoid seeking past the beginning of the file. – stratocaster_master Jan 06 '21 at 13:53
  • This code works well, but breaks with a System.IO.IOException if the number of lines requested is greater than the number of lines in the file – Sean N. May 05 '23 at 20:41
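
The two comments above (guard against seeking before the start of the file, and cope with a request for more lines than the file has) can be combined with block-wise reading so the stream is not seeked once per byte. What follows is a minimal sketch, not the answer author's code. The names `LogTail`, `ReadLastLines` and `blockSize` are made up for illustration, and it assumes the log is UTF-8 or another encoding in which the byte 0x0A only ever means a line feed. Unlike the loop in the answer, it does not count a newline that is the very last byte of the file as a line of its own.

```csharp
using System;
using System.IO;
using System.Text;

public static class LogTail
{
    // Hypothetical helper: returns the text of the last n lines of the file,
    // or the whole file when it contains fewer than n lines.
    public static string ReadLastLines(string path, int n, int blockSize = 4096)
    {
        if (n <= 0) return string.Empty;

        // FileShare.ReadWrite so a process that is still writing the log is not blocked.
        using (var fs = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
        {
            long length = fs.Length;
            long pos = length;      // every byte at or after pos has already been scanned
            long start = 0;         // byte offset where the requested tail begins
            int newlines = 0;
            bool found = false;
            var block = new byte[blockSize];

            while (pos > 0 && !found)
            {
                // Step back one block, never past the beginning of the file.
                int toRead = (int)Math.Min(blockSize, pos);
                pos -= toRead;
                fs.Seek(pos, SeekOrigin.Begin);

                int read = 0;
                while (read < toRead)
                {
                    int got = fs.Read(block, read, toRead - read);
                    if (got == 0) break;    // file shrank while reading
                    read += got;
                }

                // Scan the block backwards for line feeds.
                for (int i = read - 1; i >= 0; i--)
                {
                    if (block[i] != (byte)'\n') continue;
                    // A line feed that is the final byte only terminates the last line.
                    if (pos + i == length - 1) continue;
                    if (++newlines == n)
                    {
                        start = pos + i + 1;    // first byte after the n-th line feed
                        found = true;
                        break;
                    }
                }
            }

            // If fewer than n line feeds were found, start is still 0 and the whole file is returned.
            fs.Seek(start, SeekOrigin.Begin);
            using (var sr = new StreamReader(fs, Encoding.UTF8))
            {
                return sr.ReadToEnd();
            }
        }
    }
}
```

Calling `LogTail.ReadLastLines(path, 5)` returns the last five lines as one string. Scanning raw bytes would not be safe for UTF-16 logs, where 0x0A can appear as half of another character.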
4

A friend of mine uses this method (BackwardReader can be found here):

public static IList<string> GetLogTail(string logname, string numrows)
{
    int lineCnt = 1;
    List<string> lines = new List<string>();
    int maxLines;

    if (!int.TryParse(numrows, out maxLines))
    {
        maxLines = 100;
    }

    string logFile = HttpContext.Current.Server.MapPath("~/" + logname);

    BackwardReader br = new BackwardReader(logFile);
    while (!br.SOF)
    {
        string line = br.Readline();
        lines.Add(line + System.Environment.NewLine);
        if (lineCnt == maxLines) break;
        lineCnt++;
    }
    lines.Reverse();
    return lines;
}
Jesse C. Slicer
  • 4
    **WHY** is `numrows` a string? – SLaks Jan 06 '11 at 21:10
  • Same question as SLaks, but +1 for `BackwardReader`. I didn't know about it. – BrunoLM Jan 06 '11 at 21:11
  • I'll be honest, SLaks, I can't find anything in my buddy's blog posting that explains why. I can see that it's essentially a WCF method called from JavaScript, but I'm not sure if that adequately explains it. – Jesse C. Slicer Jan 06 '11 at 21:13
  • That BackwardReader implementation is slow (since it doesn't buffer) and cannot support Unicode. – SLaks Jan 06 '11 at 22:15
  • I've just had a look at the blog and realized that he used ASCIIEncoding, so it won't work for Unicode or any other encoding – phuclv Sep 24 '13 at 08:03
  • 2
    The link to the BackwardReader is not available anymore. – Maarten Aug 20 '15 at 09:14
2

Here is my answer:-

    private string StatisticsFile = @"c:\yourfilename.txt";

    // Read last lines of a file....
    public IList<string> ReadLastLines(int nFromLine, int nNoLines, out bool bMore)
    {
        // Initialise more
        bMore = false;
        try
        {
            char[] buffer = null;
            //lock (strMessages)  Lock something if you need to....
            {
                if (File.Exists(StatisticsFile))
                {
                    // Open file
                    using (StreamReader sr = new StreamReader(StatisticsFile))
                    {
                        long FileLength = sr.BaseStream.Length;

                        int c, linescount = 0;
                        long pos = FileLength - 1;
                        long PreviousReturn = FileLength;
                        // Process file
                        while (pos >= 0 && linescount < nFromLine + nNoLines) // Until found correct place
                        {
                            // Read a character from the end
                            c = BufferedGetCharBackwards(sr, pos);
                            if (c == Convert.ToInt32('\n'))
                            {
                                // Found return character
                                if (++linescount == nFromLine)
                                    // Found last place
                                    PreviousReturn = pos + 1; // Read to here
                            }
                            // Previous char
                            pos--;
                        }
                        pos++;
                        // Create buffer
                        buffer = new char[PreviousReturn - pos];
                        sr.DiscardBufferedData();
                        // Read all our chars
                        sr.BaseStream.Seek(pos, SeekOrigin.Begin);
                        sr.Read(buffer, (int)0, (int)(PreviousReturn - pos));
                        sr.Close();
                        // Store if more lines available
                        if (pos > 0)
                            // Is there more?
                            bMore = true;
                    }
                    if (buffer != null)
                    {
                        // Get data
                        string strResult = new string(buffer);
                        strResult = strResult.Replace("\r", "");

                        // Store in List
                        List<string> strSort = new List<string>(strResult.Split('\n'));
                        // Reverse order
                        strSort.Reverse();

                        return strSort;
                    }
                }
            }
        }
        catch (Exception ex)
        {
            System.Diagnostics.Debug.WriteLine("ReadLastLines Exception:" + ex.ToString());
        }
        // Lets return a list with no entries
        return new List<string>();
    }

    const int CACHE_BUFFER_SIZE = 1024;
    private long ncachestartbuffer = -1;
    private char[] cachebuffer = null;
    // Cache the file....
    private int BufferedGetCharBackwards(StreamReader sr, long iPosFromBegin)
    {
        // Check for error
        if (iPosFromBegin < 0 || iPosFromBegin >= sr.BaseStream.Length)
            return -1;
        // See if we have the character already
        if (ncachestartbuffer >= 0 && ncachestartbuffer <= iPosFromBegin && ncachestartbuffer + cachebuffer.Length > iPosFromBegin)
        {
            return cachebuffer[iPosFromBegin - ncachestartbuffer];
        }
        // Load into cache
        // No (int) cast here: ncachestartbuffer is a long, and truncating it breaks files larger than 2 GB
        ncachestartbuffer = Math.Max(0, iPosFromBegin - CACHE_BUFFER_SIZE + 1);
        int nLength = (int)Math.Min(CACHE_BUFFER_SIZE, sr.BaseStream.Length - ncachestartbuffer);
        cachebuffer = new char[nLength];
        sr.DiscardBufferedData();
        sr.BaseStream.Seek(ncachestartbuffer, SeekOrigin.Begin);
        sr.Read(cachebuffer, (int)0, (int)nLength);

        return BufferedGetCharBackwards(sr, iPosFromBegin);
    }

Note:-

  1. Call ReadLastLines with nFromLine starting at 0 for the last line and nNoLines as the number of lines to read back from there.
  2. It reverses the list so the 1st one is the last line in the file.
  3. bMore returns true if there are more lines to read.
  4. It caches the data in 1024-char chunks, so it is fast; you may want to increase this size for very large files.

Enjoy!

2

Does your log have lines of similar length? If yes, then you can calculate the average line length and do the following:

  1. seek to end_of_file - lines_needed*avg_line_length (previous_point)
  2. read everything up to the end
  3. if you grabbed enough lines, that's fine. If not, seek to previous_point - lines_needed*avg_line_length
  4. read everything up to previous_point
  5. goto 3
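
The steps above can be sketched roughly as follows. This is an illustration under assumptions, not a reference implementation: `EstimatedTail`, `ReadLastLines` and its parameters are hypothetical names, the file is assumed to be UTF-8, and for simplicity the sketch re-reads a doubled window from the end of the file on each retry instead of reading only the block before previous_point.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class EstimatedTail
{
    // Hypothetical helper: returns the last linesNeeded lines, starting from a guess
    // of linesNeeded * avgLineLength bytes and widening the window until it is enough.
    public static string[] ReadLastLines(string path, int linesNeeded, int avgLineLength)
    {
        if (linesNeeded <= 0) return new string[0];

        using (var fs = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
        {
            long length = fs.Length;
            long window = (long)linesNeeded * Math.Max(1, avgLineLength);

            while (true)
            {
                long start = Math.Max(0, length - window);
                fs.Seek(start, SeekOrigin.Begin);

                var bytes = new byte[length - start];
                int read = 0;
                while (read < bytes.Length)
                {
                    int got = fs.Read(bytes, read, bytes.Length - read);
                    if (got == 0) break;
                    read += got;
                }

                int newlines = 0;
                for (int i = 0; i < read; i++)
                {
                    if (bytes[i] == (byte)'\n') newlines++;
                }

                // The first line in the window may be cut off, so more line feeds than
                // requested lines are needed unless the window already starts at byte 0.
                if (start == 0 || newlines > linesNeeded)
                {
                    var lines = new List<string>(Encoding.UTF8.GetString(bytes, 0, read).Split('\n'));
                    // A trailing line feed leaves an empty last element; drop it.
                    if (lines.Count > 0 && lines[lines.Count - 1].Length == 0)
                        lines.RemoveAt(lines.Count - 1);

                    int skip = Math.Max(0, lines.Count - linesNeeded);
                    var result = new string[lines.Count - skip];
                    for (int i = 0; i < result.Length; i++)
                        result[i] = lines[skip + i].TrimEnd('\r');
                    return result;
                }

                window *= 2;    // the guess was too small: widen and try again
            }
        }
    }
}
```

A good estimate of the average line length means the first window is usually enough; a poor estimate only costs a few extra passes, because the window doubles each time.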

A memory-mapped file is also a good method: map the tail of the file, count the lines, map the previous block, count the lines, and so on until you have the number of lines needed.

Eugene Mayevski 'Callback
  • This is a great answer for cases where you only need an approximate number of lines returned. Massively reduces the number of loops and the time taken. Added my implementation of it as an answer. – Ash Jan 13 '21 at 00:26
2

This is in no way optimal but for quick and dirty checks with small log files I've been using something like this:

List<string> mostRecentLines = File.ReadLines(filePath)
    // .Where(....)
    // .Distinct()
    .Reverse()
    .Take(10)
    .ToList();
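
A possible variation, assuming .NET Core 2.0 or later (or another runtime that provides `Enumerable.TakeLast`; it is not part of .NET Framework): `TakeLast` keeps only the requested number of lines while enumerating, whereas `Reverse()` has to buffer every line first. The class and method names below are made up for this sketch. The file is still read from start to end, so this remains a quick-and-dirty option for small logs.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class QuickTail
{
    // Hypothetical helper: returns the last `count` lines in file order (oldest first).
    public static List<string> LastLines(string filePath, int count)
    {
        return File.ReadLines(filePath)
            .TakeLast(count)
            .ToList();
    }
}
```

Note the ordering difference: `Reverse().Take(10)` yields the newest line first, while `TakeLast(10)` keeps the lines in the order they appear in the file.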
Ekus
0

Something that you can now do very easily in .NET 4.0 (and with just a tiny bit of effort in earlier versions) is use memory-mapped files for this type of operation. It's ideal for large files because you can map just a portion of the file, then access it as virtual memory.

There is a good example here.
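
As a rough illustration of the idea (not the linked example, which is not reproduced here), the sketch below maps only the last part of the file and wraps the view in a `StreamReader`, so the mapped bytes can still be read line by line. `MappedTail`, `ReadTailLines` and `tailBytes` are hypothetical names, and the file is assumed to be UTF-8. The path-based `CreateFromFile` overload used here may fail with a sharing violation while another process has the log open for writing; in that case one would open a `FileStream` with `FileShare.ReadWrite` and pass it to the overload that accepts a stream.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.MemoryMappedFiles;
using System.Text;

public static class MappedTail
{
    // Hypothetical helper: maps the last tailBytes bytes of the file and returns
    // the complete lines found in that window.
    public static string[] ReadTailLines(string path, int tailBytes)
    {
        long length = new FileInfo(path).Length;
        // An empty file cannot be mapped, so return early.
        if (length == 0 || tailBytes <= 0) return new string[0];

        long offset = Math.Max(0, length - tailBytes);
        int size = (int)(length - offset);

        // Capacity 0 means "use the current size of the file on disk".
        using (var mmf = MemoryMappedFile.CreateFromFile(
                   path, FileMode.Open, null, 0, MemoryMappedFileAccess.Read))
        using (var view = mmf.CreateViewStream(offset, size, MemoryMappedFileAccess.Read))
        using (var reader = new StreamReader(view, Encoding.UTF8))
        {
            // When the window starts mid-file, its first line is probably cut off; discard it.
            if (offset > 0) reader.ReadLine();

            var lines = new List<string>();
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                lines.Add(line);
            }
            return lines.ToArray();
        }
    }
}
```

To get an exact number of lines, the window can be grown (for example doubled) until it contains enough of them, in the same way as the seek-based approaches in the other answers.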

Tim Jarvis
  • This is a good idea; however, as far as I understand, it does not allow reading files by lines (text), as the question asks. – AaA Nov 28 '14 at 06:39
0

As @EugeneMayevski stated above, if you just need an approximate number of lines returned, each line has roughly the same line length and you're more concerned with performance especially for large files, this is a better implementation:

    internal static StringBuilder ReadApproxLastNLines(string filePath, int approxLinesToRead, int approxLengthPerLine)
    {
        //If each line is more or less of the same length and you don't really care if you get back exactly the last n
        using (FileStream fs = new FileStream(filePath, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
        {
            var totalCharsToRead = approxLengthPerLine * approxLinesToRead;
            var buffer = new byte[1];
             //read approx chars to read backwards from end
            fs.Seek(totalCharsToRead > fs.Length ? -fs.Length : -totalCharsToRead, SeekOrigin.End);
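            // Skip forward to the next newline so the result starts on a line boundary.
            // Stop at end of file (Read returns 0) so a window without a newline cannot loop forever.
            while (buffer[0] != '\n' && fs.Position > 0 && fs.Read(buffer, 0, 1) == 1)
            {
            }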
            var returnStringBuilder = new StringBuilder();
            using (StreamReader sr = new StreamReader(fs))
            {
                returnStringBuilder.Append(sr.ReadToEnd());
            }
            return returnStringBuilder;
        }
    }
Ash
0

Most log files have a DateTime stamp. Although it can be improved, the code below works well if you want the log messages from the last N days.

    /// <summary>
    /// Returns list of entries from the last N days.
    /// </summary>
    /// <param name="N"></param>
    /// <param name="cSEP">field separator, default is TAB</param>
    /// <param name="indexOfDateColumn">default is 0; change if it is not the first item in each line</param>
    /// <param name="bFileHasHeaderRow"> if true, it will not include the header row</param>
    /// <returns></returns>
    public List<string> ReadMessagesFromLastNDays(int N, char cSEP ='\t', int indexOfDateColumn = 0, bool bFileHasHeaderRow = true)
    {
        List<string> listRet = new List<string>();

        //--- replace msFileName with the name (incl. path if appropriate)
        string[] lines = File.ReadAllLines(msFileName);

        if (lines.Length > 0)
        {
            DateTime dtm = DateTime.Now.AddDays(-N);

            string sCheckDate = GetTimeStamp(dtm);
            //--- process lines in reverse
            int iMin = bFileHasHeaderRow ? 1 : 0;
            for (int i = lines.Length - 1; i >= iMin; i--)  //skip the header in line 0, if any
            {
                if (lines[i].Length > 0)  //skip empty lines
                {
                    string[] s = lines[i].Split(cSEP);
                    //--- s[indexOfDateColumn] contains the DateTime stamp in the log file
                    if (string.Compare(s[indexOfDateColumn], sCheckDate) >= 0)
                    {
                        //--- insert at top of list or they'd be in reverse chronological order
                        listRet.Insert(0, s[1]);    // s[1] assumes the message text is the second field
                    }
                    else
                    {
                        break; //out of loop
                    }
                }
            }
        }

        return listRet;
    }

    /// <summary>
    /// Returns DateTime Stamp as formatted in the log file
    /// </summary>
    /// <param name="dtm">DateTime value</param>
    /// <returns></returns>
    private string GetTimeStamp(DateTime dtm)
    {
        // adjust format string to match what you use
        return dtm.ToString("u");
    }
victorbos