
Hi, I was searching for the difference between File.ReadLines() and File.ReadAllLines() and found the question below.

What is the difference between File.ReadLines() and File.ReadAllLines()?

Don't we get the same memory usage and performance if we call ToArray() at the end of ReadLines()?

U.Deniz A.
  • The answer is very probably "yes". But why don't you just try it yourself? – Klaus Gütter Jul 26 '22 at 05:28
  • There are more factors to evaluate besides memory usage and performance. Readability is one of them. That being said their intended usage is different and if your solution can use `File.ReadLines()` without calling `.ToArray()` in large files it should perform better. Check John's answer for a more elaborate explanation. – Cleptus Jul 26 '22 at 05:54

1 Answer

It's pointless calling ReadLines if you are going to call ToArray. It probably even performs worse than ReadAllLines. The point of ReadLines is that it reads one line at a time and exposes that line for processing, so you never have to hold the entire file contents in memory at the same time and you may be able to halt processing without reading the entire file. Imagine searching a file of a million lines for some text that is in the first line. ReadLines would enable you to read that first line and then stop, while ReadAllLines would make you wait until all lines had been read into memory and then you'd only use the first one.
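As a minimal sketch of the early-exit case described above (the class name, method names, temporary file, and the "needle" text are all made up for illustration), the lazy version can stop enumerating after the first match, while the eager version has to build the whole array first:

```csharp
using System;
using System.IO;
using System.Linq;

static class EarlyExitDemo
{
    // Lazy: lines are pulled one at a time and enumeration stops at the first match.
    public static string FirstMatchLazy(string path, string text)
    {
        return File.ReadLines(path).FirstOrDefault(line => line.Contains(text));
    }

    // Eager: the whole file is loaded into a string[] before the search starts.
    public static string FirstMatchEager(string path, string text)
    {
        return File.ReadAllLines(path).FirstOrDefault(line => line.Contains(text));
    }

    static void Main()
    {
        // Hypothetical sample file: the search text sits on the very first line.
        string path = Path.GetTempFileName();
        File.WriteAllLines(path,
            Enumerable.Range(1, 1000000).Select(i => i == 1 ? "needle" : "line " + i));

        // Both calls return the same line; only the amount of work done differs.
        Console.WriteLine(FirstMatchLazy(path, "needle"));
        Console.WriteLine(FirstMatchEager(path, "needle"));

        File.Delete(path);
    }
}
```

Both methods return the same result here. The difference is only in how much of the file is read and held in memory before that result is available.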

For small files, it doesn't really make much difference but it is more correct to use ReadLines if you are using the data sequentially and only once. For big files, it can make a significant difference so you should definitely use ReadLines unless you specifically need all the data at the same time, e.g. for random and/or multiple access.

John
  • "_It probably even performs worse than ReadAllLines_" I haven't located the actual implementation of that specific `ToArray()`. The one I have found is [List.ToArray()](https://referencesource.microsoft.com/#mscorlib/system/collections/generic/list.cs,cf7f4095e4de7646) and that could perform worse because of the allocation of a large array in memory and data copying into the array. – Cleptus Jul 26 '22 at 06:01
  • @Cleptus, I started out saying "may" and then changed that to "probably" because I would expect that there is some overhead involved. It is often the case that things designed to provide better performance in most cases will involve an overhead that results in worse performance in the most basic cases. I think that it's safe to say that the best case scenario is that it would produce the same performance. – John Jul 26 '22 at 06:06
  • @Cleptus, `ReadAllLines` uses a `StreamReader` and reads lines into a `List` and then calls `ToArray` on that, which you may already know. `System.Linq.Enumerable.ToArray` is [here](https://referencesource.microsoft.com/#System.Core/System/Linq/Enumerable.cs,783a052330e7d48d,references). – John Jul 26 '22 at 06:16
  • So it boils down to the efficiency remarks in the [ReadLines(String) documentation](https://learn.microsoft.com/en-us/dotnet/api/system.io.file.readlines?redirectedfrom=MSDN&view=net-6.0#System_IO_File_ReadLines_System_String_) being voided by the usage the OP suggests, `File.ReadLines().ToArray()`. – Cleptus Jul 26 '22 at 06:30
  • @John But how can I access a specific line if I use ReadLines? For example, if I want to store the line number of some specific line, how can I do that, since ReadLines doesn't store the index of the line (line number)? Or what if I don't want to read all the lines in the txt file but only some specific numbered lines, for example lines 345 to 350? I believe I cannot do that with ReadLines since it doesn't know the line number (row number). I think foreach is the only option for reading the file, so it reads the whole file. Can you help me with that? – U.Deniz A. Jul 28 '22 at 07:49
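
One possible way to handle the line-number question in the last comment is sketched below. It is an illustration rather than an answer from the thread, and the helper names `LineNumberOf` and `ReadRange` and the sample file are invented. A `foreach` over `ReadLines` can keep its own counter and return early, and `Skip`/`Take` can select a numbered range. Note that reaching line 345 still means reading the 344 lines before it, but nothing after line 350 is read.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

static class LineNumberDemo
{
    // Returns the 1-based number of the first line equal to 'wanted', or -1 if none.
    // The loop returns as soon as the line is found, so the rest of the file is not read.
    public static int LineNumberOf(string path, string wanted)
    {
        int number = 0;
        foreach (string line in File.ReadLines(path))
        {
            number++;
            if (line == wanted)
            {
                return number;
            }
        }
        return -1;
    }

    // Lazily yields lines 'first' through 'last' (1-based, inclusive).
    public static IEnumerable<string> ReadRange(string path, int first, int last)
    {
        return File.ReadLines(path).Skip(first - 1).Take(last - first + 1);
    }

    static void Main()
    {
        // Hypothetical sample file with 1000 lines: "row 1" ... "row 1000".
        string path = Path.GetTempFileName();
        File.WriteAllLines(path, Enumerable.Range(1, 1000).Select(i => "row " + i));

        Console.WriteLine(LineNumberOf(path, "row 345"));

        foreach (string line in ReadRange(path, 345, 350))
        {
            Console.WriteLine(line);
        }

        File.Delete(path);
    }
}
```

If the same file needs many lookups by line number in arbitrary order, that is the random-access case from the answer, where `ReadAllLines` and indexing into the array is likely the simpler choice.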