What is the most efficient way to display the last 10 lines of a very large text file (this particular file is over 10 GB)? I was thinking of just writing a simple C# app, but I'm not sure how to do this effectively.
-
“Effectively”? What exactly do you mean? Fast execution? Small memory footprint? – Bombe Dec 29 '08 at 19:19
-
all of the above? :D – DV. Dec 29 '08 at 19:21
-
fast execution is top priority. thanks! – Chris Conway Dec 29 '08 at 20:27
21 Answers
Read to the end of the file, then seek backwards until you find ten newlines, and then read forward to the end, taking into consideration various encodings. Be sure to handle cases where the number of lines in the file is less than ten. Below is an implementation (in C#, as you tagged this), generalized to find the last `numberOfTokens` in the file located at `path`, encoded in `encoding`, where the token separator is represented by `tokenSeparator`; the result is returned as a `string` (this could be improved by returning an `IEnumerable<string>` that enumerates the tokens).
public static string ReadEndTokens(string path, Int64 numberOfTokens, Encoding encoding, string tokenSeparator) {
    int sizeOfChar = encoding.GetByteCount("\n");
    byte[] buffer = encoding.GetBytes(tokenSeparator);

    using (FileStream fs = new FileStream(path, FileMode.Open)) {
        Int64 tokenCount = 0;
        Int64 endPosition = fs.Length / sizeOfChar;

        for (Int64 position = sizeOfChar; position < endPosition; position += sizeOfChar) {
            fs.Seek(-position, SeekOrigin.End);
            fs.Read(buffer, 0, buffer.Length);

            if (encoding.GetString(buffer) == tokenSeparator) {
                tokenCount++;
                if (tokenCount == numberOfTokens) {
                    byte[] returnBuffer = new byte[fs.Length - fs.Position];
                    fs.Read(returnBuffer, 0, returnBuffer.Length);
                    return encoding.GetString(returnBuffer);
                }
            }
        }

        // handle case where number of tokens in file is less than numberOfTokens
        fs.Seek(0, SeekOrigin.Begin);
        buffer = new byte[fs.Length];
        fs.Read(buffer, 0, buffer.Length);
        return encoding.GetString(buffer);
    }
}

-
That assumes an encoding where the size of the character is always the same. It could get tricky in other encodings. – Jon Skeet Dec 29 '08 at 20:31
-
And, as Skeet informed me once, the Read method is not guaranteed to read the requested number of bytes. You have to check the return value to determine if you're done reading... – Dec 29 '08 at 20:52
-
@Will: There are several places where error checking should be added to the code. Thank you, though, for reminding me of one of the nasty facts about Stream.Read. – jason Dec 30 '08 at 02:45
-
I've noticed this procedure is quite timely when executed on a file ~4MB. Any suggested improvements? Or other C# examples on tailing files? – GONeale Mar 02 '09 at 05:01
-
This is quite a sweet implementation. I modified slightly for a log file I want to read without locking (log is in constant use by another application): `new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.ReadWrite)`. And in order to wait until the file is available, I wrapped the whole lot in: `while (true) { try { ... } catch (IOException) { Thread.Sleep(500); } }` – grenade Feb 28 '14 at 14:46
-
@Jason Best get used to variable length encoding... UTF-8 is not going away. – Stijn de Witt Aug 10 '14 at 20:29
-
@jason Old post but it looks like it is what I need, but can you give me an example usage? If `numberOfTokens` is the number of lines, is `tokenSeparator` then `\n`? – YvesR Dec 09 '16 at 16:16
-
Yes, numberOfTokens is the number of N last lines and tokenSeparator is \n or \r\n depending on the file content. I changed the code to if (tokenCount++ == numberOfTokens) in order to compare tokenCount to numberOfTokens before incrementing. This results in the correct number of returned lines. – thomasgalliker May 09 '23 at 10:21
I'd likely just open it as a binary stream, seek to the end, then back up looking for line breaks. Back up 10 (or 11 depending on that last line) to find your 10 lines, then just read to the end and use Encoding.GetString on what you read to get it into a string format. Split as desired.
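A minimal sketch of that approach, assuming a single-byte newline (so ASCII or UTF-8, not UTF-16) and eliding the error handling a real version would need:

```csharp
using System;
using System.IO;
using System.Text;

static class TailSketch
{
    // Walk backwards from the end counting '\n' bytes, then read forward.
    // Assumes a single-byte newline encoding (ASCII/UTF-8).
    public static string Tail(string path, int lineCount)
    {
        using (var fs = new FileStream(path, FileMode.Open, FileAccess.Read))
        {
            long pos = fs.Length;
            // Ignore one trailing newline so "line\n" counts as a single line.
            if (pos > 0)
            {
                fs.Seek(pos - 1, SeekOrigin.Begin);
                if (fs.ReadByte() == '\n') pos--;
            }
            int newlines = 0;
            while (pos > 0 && newlines < lineCount)
            {
                fs.Seek(--pos, SeekOrigin.Begin);
                if (fs.ReadByte() == '\n') newlines++;
            }
            // Start just after the newline we stopped on (or at 0 for short files).
            fs.Seek(newlines == lineCount ? pos + 1 : 0, SeekOrigin.Begin);
            var tail = new byte[fs.Length - fs.Position];
            int read = 0;
            while (read < tail.Length)               // Read may return fewer bytes than asked
                read += fs.Read(tail, read, tail.Length - read);
            return Encoding.UTF8.GetString(tail);
        }
    }
}
```

Seeking byte-at-a-time like this is slow; production code would read backwards in buffered chunks, but the shape of the search is the same.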

Tail? Tail is a unix command that will display the last few lines of a file. There is a Windows version in the Windows 2003 Server resource kit.

As the others have suggested, you can go to the end of the file and read backwards, effectively. However, it's slightly tricky - particularly because if you have a variable-length encoding (such as UTF-8) you need to be cunning about making sure you get "whole" characters.
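For UTF-8 in particular, the "whole character" problem is tractable because continuation bytes always match the bit pattern 10xxxxxx, so after landing at an arbitrary offset you can back up to a character boundary. A sketch of just that boundary check (the helper name is mine, not from any answer here):

```csharp
using System.IO;

static class Utf8Boundary
{
    // Back up from an arbitrary byte offset until we are no longer in the
    // middle of a multi-byte UTF-8 sequence. Continuation bytes are 10xxxxxx;
    // ASCII bytes and lead bytes are not, so they mark a safe start point.
    public static long AlignToCharStart(FileStream fs, long position)
    {
        while (position > 0)
        {
            fs.Seek(position, SeekOrigin.Begin);
            int b = fs.ReadByte();
            if ((b & 0xC0) != 0x80)  // not a continuation byte
                break;
            position--;
        }
        return position;
    }
}
```

Other encodings (Shift-JIS and friends) lack this self-synchronizing property, which is what makes a fully general backwards reader painful.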

-
hm? `\r` and `\n` are single bytes in UTF-8. There might be issues, but only with weird legacy encodings. – CodesInChaos Jun 06 '13 at 23:05
-
@CodesInChaos: I didn't say that `\r` and `\n` weren't single bytes... but *other* characters take more bytes (anything over U+0080) so you need to take account of that - if you seek to some arbitrary point in the file, you may be "mid-character" and have to account for that. UTF-8 makes it feasible (but not easy) as you can always *tell* when you're mid-character... but other encodings may not. I've written code to read a file backwards, and it's a painful business. – Jon Skeet Jun 07 '13 at 05:48
You should be able to use FileStream.Seek() to move to the end of the file, then work your way backwards, looking for \n until you have enough lines.

I'm not sure how efficient it will be, but in Windows PowerShell getting the last ten lines of a file is as easy as
Get-Content file.txt | Select-Object -last 10

-
Beginning with PowerShell v5, the Get-Content command supports the `-Tail` parameter which *does not* have the performance problem that this method does. This should be `Get-Content file.txt -Tail 10`. Additionally, you can specify the `-Wait` parameter to output updates to the file as they are being made, similar to `tail -f`. So `Get-Content file -Tail 10 -Wait` will output the last 10 lines of the file, and then wait and append new lines subsequently added to the file later. – Bacon Bits Nov 08 '18 at 19:42
That is what the unix tail command does. See http://en.wikipedia.org/wiki/Tail_(Unix)
There are lots of open-source implementations on the internet, and here is one for Win32: Tail for Win32
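Assuming one of those ports behaves like GNU tail, usage looks like this (the demo file name is made up):

```shell
# Build a 100-line demo file, then show its last 10 lines.
# tail seeks near the end of the file instead of reading all of it,
# which is what makes it fast even on multi-gigabyte files.
seq 1 100 > demo.log
tail -n 10 demo.log    # prints 91 through 100
```

`tail -f demo.log` additionally follows the file, printing new lines as they are appended.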

I think the following code will solve the problem, with subtle changes regarding encoding:
StreamReader reader = new StreamReader(@"c:\test.txt"); // pick the appropriate Encoding
reader.BaseStream.Seek(0, SeekOrigin.End);
int count = 0;
while ((count < 10) && (reader.BaseStream.Position > 0))
{
    reader.BaseStream.Position--;
    int c = reader.BaseStream.ReadByte();
    if (reader.BaseStream.Position > 0)
        reader.BaseStream.Position--;
    if (c == Convert.ToInt32('\n'))
    {
        ++count;
    }
}
string str = reader.ReadToEnd();
string[] arr = str.Replace("\r", "").Split('\n');
reader.Close();

-
Something with a brief bit of testing, change reader.Read() to reader.BaseStream.ReadByte(), while should check that Position>0, and 2nd Position-- should check if Position>0. Finally, at the very end, every newline is "\r\n" not just '\n', so change Split('\n') to Replace("\r", "").Split('\n'). It needed some fine tuning, but if you have the time to complain "does not work," instead figure out what's wrong and actually critique it. – Peter Lacerenza Oct 30 '12 at 15:03
You could use the Windows version of the tail command and just pipe its output to a text file with the > symbol, or view it on the screen, depending on what your needs are.

-
I think that is somewhat what Eric Ness said. But sometimes I really do like the Linux commands - optimised for text manipulation on command line, no, sorry, terminal... – Anthony Horne Jun 03 '15 at 13:19
Here is my version. HTH
using (StreamReader sr = new StreamReader(path))
{
    sr.BaseStream.Seek(0, SeekOrigin.End);
    int c;
    int count = 0;
    long pos = -1;
    while (count < 10)
    {
        sr.BaseStream.Seek(pos, SeekOrigin.End);
        c = sr.Read();
        sr.DiscardBufferedData();
        if (c == Convert.ToInt32('\n'))
            ++count;
        --pos;
    }
    sr.BaseStream.Seek(pos, SeekOrigin.End);
    string str = sr.ReadToEnd();
    string[] arr = str.Split('\n');
}

-
If your file is less than 10 lines your code will crash. Use this while-sentence instead `while (count < 10 && -pos < sr.BaseStream.Length)` – Jesper Jun 25 '20 at 11:56
Using Sisutil's answer as a starting point, you could read the file line by line and load the lines into a `Queue<string>`. It does read the file from the start, but it has the virtue of not trying to read the file backwards, which can be really difficult if you have a file with a variable character-width encoding like UTF-8, as Jon Skeet pointed out. It also doesn't make any assumptions about line length.
I tested this against a 1.7GB file (didn't have a 10GB one handy) and it took about 14 seconds. Of course, the usual caveats apply when comparing load and read times between computers.
int numberOfLines = 10;
string fullFilePath = @"C:\Your\Large\File\BigFile.txt";
var queue = new Queue<string>(numberOfLines);

using (FileStream fs = File.Open(fullFilePath, FileMode.Open, FileAccess.Read, FileShare.Read))
using (BufferedStream bs = new BufferedStream(fs)) // May not make much difference.
using (StreamReader sr = new StreamReader(bs))
{
    while (!sr.EndOfStream)
    {
        if (queue.Count == numberOfLines)
        {
            queue.Dequeue();
        }
        queue.Enqueue(sr.ReadLine());
    }
}

// The queue now has our set of lines. So print to console, save to another file, etc.
do
{
    Console.WriteLine(queue.Dequeue());
} while (queue.Count > 0);

I just had the same problem: a huge log file that should be accessed via a REST interface. Of course, loading it into memory and sending it complete via HTTP was no solution.
As Jon pointed out, this solution has a very specific use case. In my case, I know for sure (and check) that the encoding is UTF-8 (with BOM!) and can thus profit from all the blessings of UTF-8. It is surely not a general-purpose solution.
Here is what worked for me extremely well and fast (I forgot to close the stream - fixed now):
private string tail(StreamReader streamReader, long numberOfBytesFromEnd)
{
    Stream stream = streamReader.BaseStream;
    long length = streamReader.BaseStream.Length;
    if (length < numberOfBytesFromEnd)
        numberOfBytesFromEnd = length;
    stream.Seek(numberOfBytesFromEnd * -1, SeekOrigin.End);

    int LF = '\n';
    bool found = false;
    while (!found)
    {
        int c = stream.ReadByte();
        if (c == LF || c == -1) // stop at the first newline, or at end of stream
            found = true;
    }

    string readToEnd = streamReader.ReadToEnd();
    streamReader.Close();
    return readToEnd;
}
We first seek to somewhere near the end with the BaseStream, and once we have the right stream position, read to the end with the usual StreamReader.
This doesn't really allow specifying the number of lines from the end, which isn't a good idea anyway, as the lines could be arbitrarily long and thus kill the performance again. So I specify the number of bytes, read until we get to the first newline, and then comfortably read to the end. Theoretically, you could also look for the carriage return, but in my case that was not necessary.
If we use this code, it will not disturb a writer thread:
FileStream fileStream = new FileStream(
    filename,
    FileMode.Open,
    FileAccess.Read,
    FileShare.ReadWrite);
StreamReader streamReader = new StreamReader(fileStream);

-
Note that this assumes that `'\n'` will appear as a single byte for the character, and that it can't appear in any other way. That's okay for some encodings, but certainly not all. Also, loading "some number of lines" (possibly 0) from the end may be fine for you, but it's not really what was being asked in the question. Finally, you should probably call `streamReader.DiscardBufferedData()` so that if it *has* buffered anything, it doesn't use that information on the next read call, and instead consults the stream. – Jon Skeet Nov 25 '15 at 06:49
-
Thanks for the comment and let me say, I am totally geeking out right now: my first comment from Jon Skeet himself :-) – Xan-Kun Clark-Davis Nov 25 '15 at 18:14
-
I edited the answer and hope it is better that way. In my case the answer should be transferred via HTTP and presented in a browser. So I didn't really want to use line numbers, as a lot of long lines can change the whole situation quickly. By specifying the number of bytes, I can always guarantee that the answer is quick. And oh boy is this fast. I am going to do some testing (after the actual work :-) ) because I am really curious. It seems to outperform all other solutions, but that is a little far fetched. I wonder what the OS is really doing with this... Thanks for making my day ☃ – Xan-Kun Clark-Davis Nov 25 '15 at 18:21
If you open the file with FileMode.Append, it will seek to the end of the file for you. Then you could seek back the number of bytes you want and read them. It might not be fast, though, regardless of what you do, since that's a pretty massive file.

One useful method is `FileInfo.Length`. It gives the size of a file in bytes.
What structure is your file? Are you sure the last 10 lines will be near the end of the file? If you have a file with 12 lines of text and 10GB of 0s, then looking at the end won't really be that fast. Then again, you might have to look through the whole file.
If you are sure that the file contains numerous short strings each on a new line, seek to the end, then check back until you've counted 11 end of lines. Then you can read forward for the next 10 lines.
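If you are willing to assume the last 10 lines fit inside some bounded window, `FileInfo.Length` also lets you read just one chunk from the end rather than seeking back byte by byte. A sketch under that assumption (window size and names are illustrative, and it assumes UTF-8):

```csharp
using System;
using System.IO;
using System.Linq;
using System.Text;

static class ChunkTail
{
    // Read at most maxBytes from the end and keep the last lineCount lines
    // found in that window. If lines can be longer, grow the window and retry.
    public static string[] Tail(string path, int lineCount, int maxBytes = 64 * 1024)
    {
        long length = new FileInfo(path).Length;
        int toRead = (int)Math.Min(maxBytes, length);
        var buffer = new byte[toRead];
        using (var fs = File.OpenRead(path))
        {
            fs.Seek(-toRead, SeekOrigin.End);
            int read = 0;
            while (read < toRead)                    // Read may return fewer bytes than asked
                read += fs.Read(buffer, read, toRead - read);
        }
        var lines = Encoding.UTF8.GetString(buffer)
                            .Split('\n')
                            .Select(l => l.TrimEnd('\r'))
                            .ToList();
        if (lines.Count > 0 && lines[lines.Count - 1].Length == 0)
            lines.RemoveAt(lines.Count - 1);        // drop the entry after a trailing newline
        int skipPartial = toRead < length ? 1 : 0;  // first entry may be a partial line
        return lines.Skip(Math.Max(skipPartial, lines.Count - lineCount)).ToArray();
    }
}
```

This reads one sequential chunk, which plays much better with disk caches than single-byte backward seeks.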

I think the other posters have all shown that there is no real shortcut.
You can either use a tool such as tail (or powershell) or you can write some dumb code that seeks end of file and then looks back for n newlines.
There are plenty of implementations of tail out there on the web - take a look at the source code to see how they do it. Tail is pretty efficient (even on very very large files) and so they must have got it right when they wrote it!

In case you need to read any number of lines in reverse from a text file, here's a LINQ-compatible class you can use. It focuses on performance and support for large files. You could read several lines and call Reverse() to get the last several lines in forward order:
Usage:
var reader = new ReverseTextReader(@"C:\Temp\ReverseTest.txt");
string line;
while ((line = reader.ReadLine()) != null)
    Console.WriteLine(line);
ReverseTextReader Class:
/// <summary>
/// Reads a text file backwards, line-by-line.
/// </summary>
/// <remarks>This class uses file seeking to read a text file of any size in reverse order. This
/// is useful for needs such as reading a log file newest-entries first.</remarks>
public sealed class ReverseTextReader : IEnumerable<string>
{
    private const int BufferSize = 16384;   // The number of bytes read from the underlying stream.
    private readonly Stream _stream;        // Stores the stream feeding data into this reader
    private readonly Encoding _encoding;    // Stores the encoding used to process the file
    private byte[] _leftoverBuffer;         // Stores the leftover partial line after processing a buffer
    private readonly Queue<string> _lines;  // Stores the lines parsed from the buffer

    #region Constructors

    /// <summary>
    /// Creates a reader for the specified file.
    /// </summary>
    /// <param name="filePath"></param>
    public ReverseTextReader(string filePath)
        : this(new FileStream(filePath, FileMode.Open, FileAccess.Read, FileShare.Read), Encoding.Default)
    { }

    /// <summary>
    /// Creates a reader using the specified stream.
    /// </summary>
    /// <param name="stream"></param>
    public ReverseTextReader(Stream stream)
        : this(stream, Encoding.Default)
    { }

    /// <summary>
    /// Creates a reader using the specified path and encoding.
    /// </summary>
    /// <param name="filePath"></param>
    /// <param name="encoding"></param>
    public ReverseTextReader(string filePath, Encoding encoding)
        : this(new FileStream(filePath, FileMode.Open, FileAccess.Read, FileShare.Read), encoding)
    { }

    /// <summary>
    /// Creates a reader using the specified stream and encoding.
    /// </summary>
    /// <param name="stream"></param>
    /// <param name="encoding"></param>
    public ReverseTextReader(Stream stream, Encoding encoding)
    {
        _stream = stream;
        _encoding = encoding;
        _lines = new Queue<string>(128);
        // The stream needs to support seeking for this to work
        if (!_stream.CanSeek)
            throw new InvalidOperationException("The specified stream needs to support seeking to be read backwards.");
        if (!_stream.CanRead)
            throw new InvalidOperationException("The specified stream needs to support reading to be read backwards.");
        // Set the current position to the end of the file
        _stream.Position = _stream.Length;
        _leftoverBuffer = new byte[0];
    }

    #endregion

    #region Overrides

    /// <summary>
    /// Reads the next previous line from the underlying stream.
    /// </summary>
    /// <returns></returns>
    public string ReadLine()
    {
        // Are there lines left to read? If so, return the next one
        if (_lines.Count != 0) return _lines.Dequeue();
        // Are we at the beginning of the stream? If so, we're done
        if (_stream.Position == 0) return null;

        #region Read and Process the Next Chunk

        // Remember the current position
        var currentPosition = _stream.Position;
        var newPosition = currentPosition - BufferSize;
        // Are we before the beginning of the stream?
        if (newPosition < 0) newPosition = 0;
        // Calculate the buffer size to read
        var count = (int)(currentPosition - newPosition);
        // Set the new position
        _stream.Position = newPosition;
        // Make a new buffer but append the previous leftovers
        var buffer = new byte[count + _leftoverBuffer.Length];
        // Read the next buffer
        _stream.Read(buffer, 0, count);
        // Move the position of the stream back
        _stream.Position = newPosition;
        // And copy in the leftovers from the last buffer
        if (_leftoverBuffer.Length != 0)
            Array.Copy(_leftoverBuffer, 0, buffer, count, _leftoverBuffer.Length);
        // Look for CrLf delimiters
        var end = buffer.Length - 1;
        var start = buffer.Length - 2;
        // Search backwards for a line feed
        while (start >= 0)
        {
            // Is it a line feed?
            if (buffer[start] == 10)
            {
                // Yes. Extract a line and queue it (but exclude the \r\n)
                _lines.Enqueue(_encoding.GetString(buffer, start + 1, end - start - 2));
                // And reset the end
                end = start;
            }
            // Move to the previous character
            start--;
        }
        // What's left over is a portion of a line. Save it for later.
        _leftoverBuffer = new byte[end + 1];
        Array.Copy(buffer, 0, _leftoverBuffer, 0, end + 1);
        // Are we at the beginning of the stream?
        if (_stream.Position == 0)
            // Yes. Add the last line.
            _lines.Enqueue(_encoding.GetString(_leftoverBuffer, 0, end - 1));

        #endregion

        // If we have something in the queue, return it
        return _lines.Count == 0 ? null : _lines.Dequeue();
    }

    #endregion

    #region IEnumerator<string> Interface

    public IEnumerator<string> GetEnumerator()
    {
        string line;
        // So long as the next line isn't null...
        while ((line = ReadLine()) != null)
            // Read and return it.
            yield return line;
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        throw new NotImplementedException();
    }

    #endregion
}

Using PowerShell, Get-Content big_file_name.txt -Tail 10
where 10 is the number of bottom lines to retrieve.
This has no performance problems. I ran it on a text file that is over 100 GB and got an instant result.

If you have a file with an even format per line (such as from a DAQ system), you can just use a StreamReader to get the length of the file, then take one of the lines (ReadLine()).
Divide the total length by the length of that string. Now you have a rough long number representing the number of lines in the file.
The key is that you use ReadLine() prior to getting your data for your array or whatever. This ensures that you start at the beginning of a new line and don't get any leftover data from the previous one.
StreamReader leader = new StreamReader(GetReadFile);
leader.BaseStream.Position = 0;
StreamReader follower = new StreamReader(GetReadFile);

int count = 0;
string tmper = null;
while (count <= 12)
{
    tmper = leader.ReadLine();
    count++;
}

long total = follower.BaseStream.Length; // get total length of file
long step = tmper.Length;                // get length of 1 line
long size = total / step;                // divide to get number of lines
long go = step * (size - 12);            // get the bit location

long cut = follower.BaseStream.Seek(go, SeekOrigin.Begin); // go to that location
follower.BaseStream.Position = go;

string led = null;
string[] lead = null;
List<string[]> samples = new List<string[]>();
follower.ReadLine();
while (!follower.EndOfStream)
{
    led = follower.ReadLine();
    lead = Tokenize(led);
    samples.Add(lead);
}
Open the file and start reading lines. After you've read 10 lines open another pointer, starting at the front of the file, so the second pointer lags the first by 10 lines. Keep reading, moving the two pointers in unison, until the first reaches the end of the file. Then use the second pointer to read the result. It works with any size file including empty and shorter than the tail length. And it's easy to adjust for any length of tail. The drawback, of course, is that you end up reading the entire file and that may be exactly what you're trying to avoid.
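A sketch of that lagging-pointer idea with two StreamReaders over the same file (class and method names are mine):

```csharp
using System.Collections.Generic;
using System.IO;

static class LaggingTail
{
    // Lag a second reader lineCount lines behind the first; when the leader
    // hits end of file, the follower is positioned at the start of the tail.
    public static List<string> Tail(string path, int lineCount)
    {
        using (var leader = new StreamReader(path))
        using (var follower = new StreamReader(path))
        {
            int ahead = 0;
            while (leader.ReadLine() != null)
            {
                if (ahead < lineCount) { ahead++; continue; }
                follower.ReadLine();  // keep the follower lineCount lines behind
            }
            var tail = new List<string>();
            string line;
            while ((line = follower.ReadLine()) != null)
                tail.Add(line);
            return tail;
        }
    }
}
```

Note it handles files shorter than the tail length for free: the follower simply never advances and you get the whole file back.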

-
if the file is 10GB, I think its safe to say that's exactly what he's trying to avoid :-) – gbjbaanb Dec 29 '08 at 22:15
I used this code for a small utility some time ago; I hope it can help you!
private string ReadRows(int offset) /* offset: how many lines to read from the end (10 in your case) */
{
    /* no lines to read */
    if (offset == 0)
        return string.Empty;

    using (FileStream fs = new FileStream(FullName, FileMode.Open, FileAccess.Read, FileShare.ReadWrite, 2048, true))
    {
        List<char> charBuilder = new List<char>(); /* StringBuilder doesn't work with Encoding: example char */
        StringBuilder sb = new StringBuilder();
        int count = 0;
        /* tested with a UTF-8 file encoded by Notepad++; other encodings may not work */
        var decoder = ReaderEncoding.GetDecoder();
        byte[] buffer;
        int bufferLength;
        fs.Seek(0, SeekOrigin.End);
        while (true)
        {
            bufferLength = 1;
            buffer = new byte[1];
            /* for encodings with a variable byte size, every time I read a byte that is part of a
               character and not an entire character, the decoder returns '�' (invalid character) */
            char[] chars = { '�' }; // � 65533
            int iteration = 0;
            while (chars.Contains('�'))
            {
                /* at every iteration that does not produce a character, the buffer gets bigger, up to 4 bytes */
                if (iteration > 0)
                {
                    bufferLength = buffer.Length + 1;
                    byte[] newBuffer = new byte[bufferLength];
                    Array.Copy(buffer, newBuffer, bufferLength - 1);
                    buffer = newBuffer;
                }
                /* there are no characters with more than 4 bytes in UTF-8 */
                if (iteration > 4)
                    throw new Exception();
                /* if all is ok, the last seek returns IOError with chars = empty */
                try
                {
                    fs.Seek(-(bufferLength), SeekOrigin.Current);
                }
                catch
                {
                    chars = new char[] { '\0' };
                    break;
                }
                fs.Read(buffer, 0, bufferLength);
                var charCount = decoder.GetCharCount(buffer, 0, bufferLength);
                chars = new char[charCount];
                decoder.GetChars(buffer, 0, bufferLength, chars, 0);
                ++iteration;
            }
            /* when I get a char */
            charBuilder.InsertRange(0, chars);
            if (chars.Length > 0 && chars[0] == '\n')
                ++count;
            /* exit when I get the correct number of lines (*last row is in the interval) */
            if (count == offset + 1)
                break;
            /* the first search goes back, the reading goes on, then we come back again, except the last */
            try
            {
                fs.Seek(-(bufferLength), SeekOrigin.Current);
            }
            catch (Exception)
            {
                break;
            }
        }
    }
    /* everything must be reversed, but not \0 */
    charBuilder.RemoveAt(0);
    /* yuppi! */
    return new string(charBuilder.ToArray());
}
I attach a screenshot showing the speed.

Why not use File.ReadAllLines, which returns a string[]?
Then you can get the last 10 lines (members of the array), which would be a trivial task.
This approach doesn't take into account any encoding issues, and I'm not sure about the exact efficiency of this approach (time taken to complete the method, etc.).
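For completeness, the trivial version looks like this (a small generated demo file stands in for the real one); it loads the entire file into memory, which is exactly what a 10 GB file rules out:

```csharp
using System;
using System.IO;
using System.Linq;

// Demo file stands in for the real (huge) one; the approach is the point.
File.WriteAllLines("demo.txt", Enumerable.Range(1, 25).Select(i => i.ToString()));

string[] all = File.ReadAllLines("demo.txt");  // reads and allocates the whole file
string[] last10 = all.Skip(Math.Max(0, all.Length - 10)).ToArray();
Console.WriteLine(string.Join(", ", last10));  // 16 through 25
```

`File.ReadLines` (lazy, .NET 4+) at least avoids holding every line in memory at once, though it still scans the whole file from the start.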

-
Do read the question before giving an answer! This approach will take FAR too much time. – dnclem Dec 23 '11 at 15:14