
I need to know how to read the last line of a text file. I need to find the line and then process it into a SQL database...
I've been reading around and scouring the web but am battling to find the proper way to do this. I.e.:

  1. Find last line of file.
  2. Process last line of file.
shA.t
Debbie Dippenaar
  • What have you tried? Note we can only help with "1"... "2" is entirely up to you. – Marc Gravell Jul 24 '12 at 06:54
  • byte b; fs.Seek(0, SeekOrigin.End); for (long offset = 0; offset < fs.Length; offset++) { fs.Seek(-1, SeekOrigin.Current); b = (byte)fs.ReadByte(); if (b == 10 || b == 13) break; list.Add(b); fs.Seek(-1, SeekOrigin.Current); } list.Reverse(); string lastLine = Encoding.UTF8.GetString(list.ToArray()); – Debbie Dippenaar Jul 24 '12 at 06:57
  • @DebbieDippenaar for extremely large files, that *might not be a bad idea*. Personally I'd probably read *buffers* of bytes at a time, and you'll have some major problems with multi-byte encodings, but... – Marc Gravell Jul 24 '12 at 07:02

6 Answers


There are two ways: simple and inefficient, or horrendously complicated but efficient. The complicated version assumes a sane encoding.

Unless your file is so big that you really can't afford to read it all, I'd just use:

var lastLine = File.ReadLines("file.txt").Last();

Note that this uses File.ReadLines, not File.ReadAllLines. If you're using .NET 3.5 or earlier you'd need to use File.ReadAllLines or write your own code - ReadAllLines will read the whole file into memory in one go, whereas ReadLines streams it.
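
For reference, a minimal self-contained sketch of the simple approach. The names LastLineReader and ReadLastLineSimple are made up for illustration, and LastOrDefault is used instead of Last on the assumption that an empty file should give null rather than throw:

```csharp
using System.IO;
using System.Linq; // brings the Last/LastOrDefault extension methods into scope

// Hypothetical helper class, for illustration only.
public static class LastLineReader
{
    // Streams the file line by line and keeps only the final line.
    // Returns null for an empty file; Last() would throw
    // InvalidOperationException in that case.
    public static string ReadLastLineSimple(string path)
    {
        return File.ReadLines(path).LastOrDefault();
    }
}
```

This still reads the whole file from start to end; it just avoids holding every line in memory at once.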

Otherwise, the complicated way is to use code similar to this. It tries to read backwards from the end of the file, handling nastiness such as UTF-8 multi-byte characters. It's not pleasant.

Jon Skeet
  • If you know the encoding of the file you don't really need to handle much nastiness if you're just looking for the last line break which has other characters following it. E.g. U+000D U+000A won't be variable-length in any variable-length encoding such as UTF-8 or UTF-16. And all the rest following it can normally be assumed to be well-formed. – Joey Jul 24 '12 at 07:01
  • @Joey: The tricky bit is that if you seek backwards to an arbitrary place in the file, you need to be aware that you could be mid-character. Not so much a problem for UTF-16, but in UTF-8 you need to align yourself to a character boundary... and you could jump into the "\n" of "\r\n", so you'd have to go backwards again. Basically, have a look at the code in the linked answer. If you can write a simpler version, please do, I'd be really interested to see it. – Jon Skeet Jul 24 '12 at 07:37
  • I must be missing something here. File.ReadLines returns a System.Collections.Generic.IEnumerable object which has no "Last" method when I try to use it. My code is targeted to the .NET 4.0 framework. Am I missing some assembly reference or something else? – dscarr Jun 12 '13 at 13:29
  • @dscarr: You're missing a `using` directive: `using System.Linq;` – Jon Skeet Jun 12 '13 at 13:40
  • Be aware of performance implications using this on large files. – jjxtra Mar 21 '19 at 01:55
  • @jjxtra: Again, as already noted in the answer: "Unless your file is so big that you really can't afford to read it all" and "There are two ways: simple and inefficient, or horrendously complicated but efficient" – Jon Skeet Mar 21 '19 at 07:16

I would simply combine File.ReadLines(path) and Enumerable.Last:

String last = File.ReadLines(@"C:\file.txt").Last();

It streams the lines rather than loading them all into memory, as File.ReadAllLines does.

Tim Schmelter

First part:

File.ReadAllLines(@"c:\some\path\file.txt").Last();

or

File.ReadLines(@"c:\some\path\file.txt").Last();

ReadLines is preferred.

juFo
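string m = "";
// the using block disposes the reader even if reading throws
using (StreamReader r = new StreamReader("file_path"))
{
    // read every line, keeping only the most recent one
    while (!r.EndOfStream)
    {
        m = r.ReadLine();
    }
}
Console.WriteLine("{0}\n", m);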
jgb
krish

Note: All this code assumes UTF-8. If you need to support an encoding that uses multi-byte code units, such as UTF-16 (Encoding.Unicode in .NET), then you need to add extra checks on the bytes before and/or after the newline byte to make sure it really is a newline.

One of the primary use cases of this question is scraping the end of a log file. The other answers unfortunately die a horrible death when log files get into the megabytes. Imagine reading every line on every call on a tiny single-core VPS... yikes.

The nice thing about UTF-8 is that when you hit a '\n' byte, you don't have to worry about any dependent bytes, because any byte with the high bit clear in UTF-8 is simply an ASCII character. Pretty handy!
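
To make that property concrete, here is a small sketch that encodes text as UTF-8 and checks whether any byte falls in the ASCII range. The names Utf8ByteCheck and ContainsAsciiRangeByte are hypothetical, chosen only for this illustration:

```csharp
using System.Linq;
using System.Text;

// Hypothetical helper class, for illustration only.
public static class Utf8ByteCheck
{
    // True if any byte of the UTF-8 encoding of 'text' has the high bit clear,
    // i.e. is an ASCII byte that a backwards scan could match against '\n'.
    public static bool ContainsAsciiRangeByte(string text)
    {
        return Encoding.UTF8.GetBytes(text).Any(b => b < 0x80);
    }
}
```

For a string made only of non-ASCII characters, such as "\u20AC" (the euro sign, bytes E2 82 AC), the check returns false, so a 0x0A byte found while scanning can only be a real '\n'. The same is not true of UTF-16: under Encoding.Unicode the character U+010A is stored as the bytes 0x0A 0x01, which is why a plain byte scan is only safe for UTF-8.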

You could use the solution at 'How to read a text file reversely with iterator in C#', but be aware that the code is fairly complex. If you just need a simple UTF-8 line trailer, this solution will work very well and perform great even on large log files.

If you are monitoring lots of files at once and are using something like a FileSystemWatcher in C#, this performance gain will be very important. I am using very similar code on a cheap single cpu Linux VPS to monitor login failures and put ip addresses in the firewall in my MIT licensed project https://github.com/DigitalRuby/IPBan, using https://github.com/DigitalRuby/IPBan/blob/master/IPBanCore/Core/Utility/LogFileScanner.cs (which handles multiple new lines at once).

You'd be surprised at how large auth.log can get when your SSH port is public facing. If you are reading dozens or even hundreds of files regularly, you'll be really glad you didn't use File.ReadAllLines().Last();

As this is only a page of code, it's a nice balance between simple and very fast.

C# Code ...

/// <summary>
/// Utility class to read last line from a utf-8 text file in a performance sensitive way. The code does not handle a case where more than one line is written at once.
/// </summary>
public static class UTF8FileUtilities
{
    /// <summary>
    /// Read the last line from the file. This method assumes that each write to the file will be terminated with a new line char ('\n')
    /// </summary>
    /// <param name="path">Path of the file to read</param>
    /// <returns>The last line or null if a line could not be read (empty file or partial line write in progress)</returns>
    /// <exception cref="Exception">Opening or reading from file fails</exception>
    public static string ReadLastLine(string path)
    {
        // open read only, we don't want any chance of writing data
        using (System.IO.Stream fs = System.IO.File.OpenRead(path))
        {
            // check for empty file
            if (fs.Length == 0)
            {
                return null;
            }

            // start at end of file
            fs.Position = fs.Length - 1;

            // the file must end with a '\n' char, if not a partial line write is in progress
            int byteFromFile = fs.ReadByte();
            if (byteFromFile != '\n')
            {
                // partial line write in progress, do not return the line yet
                return null;
            }

            // move back to the new line byte - the loop will decrement position again to get to the byte before it
            fs.Position--;

            // while we have not yet reached start of file, read bytes backwards until '\n' byte is hit
            while (fs.Position > 0)
            {
                fs.Position--;
                byteFromFile = fs.ReadByte();
                if (byteFromFile < 0)
                {
                    // the only way this should happen is if someone truncates the file out from underneath us while we are reading backwards
                    throw new System.IO.IOException("Error reading from file at " + path);
                }
                else if (byteFromFile == '\n')
                {
                    // we found the new line, break out, fs.Position is one after the '\n' char
                    break;
                }
                fs.Position--;
            }

            // fs.Position will be right after the '\n' char or position 0 if no '\n' char
            // note: the returned string still includes the trailing '\n' (and any '\r' before it)
            byte[] bytes = new System.IO.BinaryReader(fs).ReadBytes((int)(fs.Length - fs.Position));
            return System.Text.Encoding.UTF8.GetString(bytes);
        }
    }
}
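
A variation that scans backwards in chunks rather than one byte at a time, along the lines suggested in the comments on the question, might look like the sketch below. BufferedLastLine, Read, ReadFully and chunkSize are hypothetical names. It still assumes UTF-8, and unlike the code above it returns a trailing partial line instead of null and strips the line terminator:

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper class, for illustration only. Assumes UTF-8 content.
public static class BufferedLastLine
{
    // Returns the last line without its terminator, or null for an empty file.
    public static string Read(string path, int chunkSize = 4096)
    {
        using (FileStream fs = File.OpenRead(path))
        {
            long end = fs.Length;
            if (end == 0) return null;

            // Ignore one trailing '\n' so that "a\nb\n" yields "b".
            fs.Position = end - 1;
            if (fs.ReadByte() == '\n') end--;

            byte[] buffer = new byte[chunkSize];
            long lineStart = 0; // default: the whole file is a single line
            long pos = end;     // next chunk to scan ends just before 'pos'
            bool found = false;
            while (pos > 0 && !found)
            {
                int n = (int)Math.Min(chunkSize, pos);
                pos -= n;
                fs.Position = pos;
                ReadFully(fs, buffer, n);
                // Scan the chunk from its last byte to its first.
                for (int i = n - 1; i >= 0; i--)
                {
                    if (buffer[i] == (byte)'\n')
                    {
                        lineStart = pos + i + 1;
                        found = true;
                        break;
                    }
                }
            }

            fs.Position = lineStart;
            byte[] line = new byte[end - lineStart];
            ReadFully(fs, line, line.Length);
            // Drop the '\r' left over from a "\r\n" terminator, if any.
            return Encoding.UTF8.GetString(line).TrimEnd('\r');
        }
    }

    // Stream.Read may return fewer bytes than requested, so loop until done.
    private static void ReadFully(Stream s, byte[] buffer, int count)
    {
        int offset = 0;
        while (offset < count)
        {
            int read = s.Read(buffer, offset, count - offset);
            if (read == 0) throw new EndOfStreamException();
            offset += read;
        }
    }
}
```

Each chunk costs one seek and one read, so a long final line no longer means one read call per byte.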
jjxtra
  • I think you missed copying the relevant part of the file to your `bytes` buffer. https://stackoverflow.com/a/24412022/343340 – DrummerB Jul 30 '18 at 12:51
  • Good catch, added. – jjxtra Jul 30 '18 at 16:19
  • "Regardless of code page, \n is most likely to always represent a newline." That may not be the whole of the character though. You're assuming that the byte after that is the start of the last line, which may not be the case. For example, using `Encoding.Unicode`, "\n" is represented as 0x0a, 0x00. You'd end up reading from half way through the character. (You're also assuming that a single call to `Stream.Read` will read the rest of the file in one call, which isn't generally a good assumption to make.) – Jon Skeet Mar 19 '19 at 07:14
  • In other words, you've sacrificed correctness for efficiency *without even knowing if efficiency is important*. In many cases, efficiency really doesn't matter too much, so long as you know up front what your criteria are. Correctness nearly *always* matters though. – Jon Skeet Mar 19 '19 at 07:17
  • @JonSkeet One of the biggest use cases and why I shared the function, is log file trailing. The other answers simply are unusable for log files in 10-100MB+ or more, especially on non-ssd drives or tiny cheap VPS with single CPU. I know, I tried the naive way and had horrible CPU spikes and IO usage. Yes, there is a link to a much more complex bit of code, but I think this solution balances performance and simplicity well. I appreciate your enthusiasm for 'premature optimization is the root of all evil...' but in this case I think it is really important. – jjxtra Mar 21 '19 at 02:42
  • @jjxtra: For *that* use case, sure. And I'm glad you removed "even for small or medium size files" - but you've **assumed** that's the use case that all readers want, with no evidence. You've warned against the complexity of more general code, but failed to mention that for cases where performance isn't important (which it may not be even for a 100MB file) the code you've presented is still *much* more complex than the simple one-liners presented. Shouldn't you at least acknowledge that performance *isn't* always important? – Jon Skeet Mar 21 '19 at 07:14
  • Of course it isn’t always. But the primary reason to read the last line of a file is to do it often. In that case, as the file grows, the other answers get slower and slower, eating CPU and disk unnecessarily. Or imagine trailing 10 or 100 files. Now things get hairy, and those who copied and pasted the top answer are scratching their heads over why their server is bogged down. Performance isn’t always the top priority, but it is important to recognize a perf-critical situation, and this is one. As I mentioned, even my cheap Linux VPS had trouble reading all the lines from a 1 MB file once a second. – jjxtra Mar 21 '19 at 14:48
  • And while it could read those lines once a second it was taking cpu away from the smtp and websites on the server. Thanks for the discussion. – jjxtra Mar 21 '19 at 14:49
string last = File.ReadLines(@"C:\file.txt").Last();
// string exposes Length (not Count), and its indexer returns a char
char lastSymbol = last[last.Length - 1];
Petter Friberg
luuk