So you're processing a text file, meaning you need to read all text, and want to preserve any newline characters, even at the end of the file.
You've correctly concluded that ReadLine() eats those, so its return value can't tell you whether the file ended with one. In fact, when a file does end with a newline, ReadLine() consumes it together with the last line of text: StreamReader.EndOfStream is already true after that call, and no extra empty line is returned. File.ReadAllLines() drops the last newline as well. File.ReadAllText() keeps it, but given you're potentially dealing with large files, you don't want to read the entire file into memory at once.
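To make the ReadLine() behaviour concrete, here is a minimal sketch. It uses a StringReader instead of a StreamReader purely so it runs without a file on disk; the ReadLineDemo class and its ReadLines helper are names I made up for the illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class ReadLineDemo
{
    // Hypothetical helper: collects everything ReadLine() returns for the given text.
    public static List<string> ReadLines(string text)
    {
        var lines = new List<string>();
        using (var reader = new StringReader(text))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                lines.Add(line);
            }
        }
        return lines;
    }

    public static void Main()
    {
        // With or without a trailing newline, the same two lines come back.
        Console.WriteLine(string.Join("|", ReadLines("a\nb\n"))); // a|b
        Console.WriteLine(string.Join("|", ReadLines("a\nb")));   // a|b
    }
}
```

Both inputs produce identical results, which is why the returned lines alone can't answer the question.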
You also can't just compare the last two bytes of the file, because there are encodings that use more than one byte to encode a character, such as UTF-16. So you'll need to read the file in an encoding-aware way. A StreamReader does just that.
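As a small illustration of why raw bytes are unreliable, this sketch prints the bytes a single \n turns into under a few encodings. The NewlineBytesDemo class and its Hex helper are made up for the example; Encoding.GetBytes and BitConverter.ToString are standard .NET calls.

```csharp
using System;
using System.Text;

public static class NewlineBytesDemo
{
    // Hypothetical helper: encodes the text and formats the bytes as hex.
    public static string Hex(Encoding encoding, string text)
    {
        return BitConverter.ToString(encoding.GetBytes(text));
    }

    public static void Main()
    {
        Console.WriteLine(Hex(Encoding.UTF8, "\n"));             // 0A
        Console.WriteLine(Hex(Encoding.Unicode, "\n"));          // 0A-00 (UTF-16 little endian)
        Console.WriteLine(Hex(Encoding.BigEndianUnicode, "\n")); // 00-0A (UTF-16 big endian)
    }
}
```

In UTF-16 little endian the file's last byte is 0x00 rather than 0x0A, so a byte comparison written with UTF-8 in mind would give the wrong answer.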
So a solution would be to create your own version of ReadLine(), which includes the newline character(s) at the end:
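using System.IO;
using System.Text;

public static class StreamReaderExtensions
{
    // Like ReadLine(), but keeps the line terminator (\n or \r\n) in the result.
    // Returns an empty string when the reader is already at the end of the stream.
    public static string ReadLineWithNewLine(this StreamReader reader)
    {
        var builder = new StringBuilder();
        while (!reader.EndOfStream)
        {
            int c = reader.Read();
            builder.Append((char) c);
            // Stop after the line feed; a preceding \r has already been appended.
            if (c == '\n')
            {
                break;
            }
        }
        return builder.ToString();
    }
}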
Then you can check whether the last returned line ends in \n:
string line = "";
using (var reader = new StreamReader(@"D:\Temp\NewlineAtEnd.txt"))
{
    while (!reader.EndOfStream)
    {
        // Each line still carries its original terminator, so Write() suffices.
        line = reader.ReadLineWithNewLine();
        Console.Write(line);
    }
}
Console.WriteLine();
// 'line' now holds the last line read, or "" if the file was empty.
if (line.EndsWith("\n"))
{
    Console.WriteLine("Newline at end of file");
}
else
{
    Console.WriteLine("No newline at end of file");
}
Though the StreamReader is heavily optimized, I can't vouch for the performance of reading one character at a time. A quick test using two equal 100 MB text files showed a quite drastic slowdown compared to ReadLine() (~1800 ms vs ~400 ms).
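If all you need is the yes/no answer and not the lines themselves, one possible variation is to read decoded characters in blocks and remember only the last one. This is a sketch I haven't benchmarked, so I make no claim about how it compares to the numbers above; the TrailingNewlineCheck class, the EndsWithNewline name and the 4096-character buffer size are all my own assumptions.

```csharp
using System;
using System.IO;

public static class TrailingNewlineCheck
{
    // Hypothetical helper: reads decoded characters in blocks and keeps
    // only the last character seen, so memory use stays constant.
    public static bool EndsWithNewline(TextReader reader)
    {
        var buffer = new char[4096];
        char last = '\0';
        int read;
        while ((read = reader.Read(buffer, 0, buffer.Length)) > 0)
        {
            last = buffer[read - 1];
        }
        return last == '\n';
    }

    public static void Main()
    {
        Console.WriteLine(EndsWithNewline(new StringReader("a\r\nb\r\n"))); // True
        Console.WriteLine(EndsWithNewline(new StringReader("a\nb")));       // False
        Console.WriteLine(EndsWithNewline(new StringReader("")));           // False
    }
}
```

Because it takes a TextReader, you can pass it a StreamReader opened on the file, which keeps the check encoding-aware.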
This approach does preserve the original line endings though, meaning you can safely rewrite a file using strings returned by this extension method, without changing all \n to \r\n or vice versa.
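A minimal sketch of that round trip, using a MemoryStream instead of a real file so it runs anywhere. The extension method is repeated so the example compiles on its own; the RoundTripDemo class and its Copy helper are made up for the illustration.

```csharp
using System;
using System.IO;
using System.Text;

public static class StreamReaderExtensions
{
    // Same method as in the answer, repeated to keep this sketch self-contained.
    public static string ReadLineWithNewLine(this StreamReader reader)
    {
        var builder = new StringBuilder();
        while (!reader.EndOfStream)
        {
            int c = reader.Read();
            builder.Append((char) c);
            if (c == '\n')
            {
                break;
            }
        }
        return builder.ToString();
    }
}

public static class RoundTripDemo
{
    // Hypothetical helper: reads the text line by line and writes each line back unchanged.
    public static string Copy(string text)
    {
        var input = new MemoryStream(Encoding.UTF8.GetBytes(text));
        var output = new StringBuilder();
        using (var reader = new StreamReader(input, Encoding.UTF8))
        {
            while (!reader.EndOfStream)
            {
                output.Append(reader.ReadLineWithNewLine());
            }
        }
        return output.ToString();
    }

    public static void Main()
    {
        // Mixed line endings and a missing final newline all survive the copy.
        string mixed = "unix\nwindows\r\nno newline at end";
        Console.WriteLine(Copy(mixed) == mixed); // True
    }
}
```

For a real file you would append to a StreamWriter with Write() rather than WriteLine(), since each string already ends with its own terminator.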