
I've been working on a PowerShell script that uses StreamReader (and StreamWriter) to parse a large file into smaller reports. While searching for the best way to put this together, I found two methods commonly used to read content to the end of the file.

1 - while ($reader.Peek() -ge 0) { $line = $reader.ReadLine() ... }

2 - while (($line = $reader.ReadLine()) -ne $null) { do stuff ... }

From the documentation, it looks like Peek returns the next character without changing the position of the reader. ReadLine seems to do essentially the same, except that it reads a whole line and advances past it. I feel like this is a "no-duh" question: is it really more efficient to peek at a value before reading the line, or is it just an extra step before assigning the line to a variable?

Thank you in advance!

briantist
LakeWater

1 Answer


Since you need lines anyway, I see no reason to Peek(). If you really want to check whether you're at the end, then the .EndOfStream property is likely to be more accurate anyway.
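As a minimal sketch of the ReadLine-only loop (the temp file and its contents are hypothetical sample input, and the try/finally is my own addition so the reader always gets disposed), something like this should work. The comparison against `$null` is explicit because an empty line is a valid result from `ReadLine()` but is falsy in PowerShell, so a bare `while ($line = $reader.ReadLine())` would stop at the first blank line.

```powershell
# Hypothetical sample input: three lines, the middle one empty.
$path = [System.IO.Path]::GetTempFileName()
[System.IO.File]::WriteAllLines($path, [string[]]@('first', '', 'third'))

$lines = [System.Collections.Generic.List[string]]::new()
$reader = [System.IO.StreamReader]::new($path)
try {
    # ReadLine() returns $null only at end of stream; '' is a real (empty) line.
    while ($null -ne ($line = $reader.ReadLine())) {
        $lines.Add($line)
    }
}
finally {
    # Release the file handle even if the loop body throws.
    $reader.Dispose()
}

$lines.Count   # 3: the empty line was not mistaken for end of file
Remove-Item $path
```

If you prefer an explicit end check, `while (-not $reader.EndOfStream) { $line = $reader.ReadLine() }` is the `.EndOfStream` variant of the same loop.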

As discussed here, .Peek() can return -1 when errors occur as well, not just when the end of stream is reached. Most answers there also recommend avoiding it and just using .ReadLine().

mklement0 also mentioned using System.IO.File.ReadLines. This returns an enumerable so you can just call it with a path and use it like other enumerables, without loading all the lines at once (so it still works with large files).

You could use it with foreach or with ForEach-Object, for example:

foreach ($line in ([System.IO.File]::ReadLines('path\to\file'))) {
    $line
}


[System.IO.File]::ReadLines('path\to\file') | ForEach-Object -Process {
    $_
}

$reader = [System.IO.File]::ReadLines('path\to\file')

foreach ($line in $reader) { $line }
$reader | ForEach-Object -Process { $_ }
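
For the "large file into smaller reports" case from the question, a rough sketch might combine ReadLines with one StreamWriter per report. Everything specific here is an assumption for illustration: the temp directory, the `big.log` file name, and the idea that the first space-separated token of each line decides which report it belongs to.

```powershell
# Hypothetical setup: a scratch directory with a small stand-in for the large file.
$dir = Join-Path ([System.IO.Path]::GetTempPath()) ([System.Guid]::NewGuid().ToString())
New-Item -ItemType Directory -Path $dir | Out-Null
$source = Join-Path $dir 'big.log'
[System.IO.File]::WriteAllLines($source, [string[]]@('INFO start', 'ERROR disk', 'INFO done'))

$writers = @{}   # one StreamWriter per category, created on first use
try {
    # ReadLines streams the file lazily, so only one line is held at a time.
    foreach ($line in [System.IO.File]::ReadLines($source)) {
        # Assumed format: "<category> <rest of line>".
        $category = ($line -split ' ', 2)[0]
        if (-not $writers.ContainsKey($category)) {
            $writers[$category] = [System.IO.StreamWriter]::new((Join-Path $dir "$category.txt"))
        }
        $writers[$category].WriteLine($line)
    }
}
finally {
    # Flush and close every report file.
    foreach ($w in $writers.Values) { $w.Dispose() }
}

$infoLines = [System.IO.File]::ReadAllLines((Join-Path $dir 'INFO.txt'))
$infoLines.Count   # 2
Remove-Item $dir -Recurse
```

There is no reader to manage by hand here; only the writers need explicit cleanup.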
briantist
  • Nice, but note that you do need the `$null` check to avoid exiting the loop on encountering the first _empty_ line. Perhaps worth mentioning [System.IO.File.ReadLines](https://learn.microsoft.com/en-US/dotnet/api/System.IO.File.ReadLines) as a more convenient alternative. – mklement0 Dec 30 '19 at 18:20
  • Thank you so much for the feedback. Somewhat on topic, I was discussing this with a coworker as well and he mentioned `$reader = [IO.File]::OpenText('path\to\file')` as a method of loading the file too. Similar to the `.Peek()` method, it seems like it's an extra step in comparison to `[System.IO.File]::ReadLines` though. However, if I understand [this post](https://stackoverflow.com/questions/34916237/opentext-vs-readlines) correctly, it's maybe a little more efficient at parsing when including parameters/patterns. – LakeWater Dec 31 '19 at 14:43
  • @LakeWater I read the question and answers on the post you linked but I'm not sure what you mean regarding parameters/patterns. There's not much "efficiency" to be gained by doing the loop iteration and file handle operations yourself when you don't need the control it provides. `.ReadLines` gives you a simple language construct (enumerable) that's highly efficient and handles all file operations. That gives you a significant advantage in terms of maintainability and clarity. – briantist Dec 31 '19 at 17:18