I'm reading the contents of a file and keeping only the lines that match a regex or are empty. But writing the result, i.e. the smaller amount of data, takes ages. Here is the code in question (I've added a few lines for debugging/measuring):
$original = Get-Content "$localDir\$ldif_file"
(Measure-Command -Expression { $original | Out-File "$localDir\Original-$ldif_file" }).TotalSeconds
$lines = ($original | Measure-Object -Line).Lines
"lines of `$original = $lines"
# Just keep lines of interest:
$stripped = $original | select-string -pattern '^custom[A-Z]','^$' -CaseSensitive
$lines = ($stripped | Measure-Object -Line).Lines
"lines of `$stripped = $lines"
(Measure-Command -Expression { $stripped | Out-File "$localDir\Stripped-$ldif_file" }).TotalSeconds
"done"
Problem: writing the smaller ($stripped) data to a file takes 342 seconds, about 30 times longer than writing the $original data! See the output below:
11.5371677
lines of $original = 188715
lines of $stripped = 126404
342.6769547
done
Why is Out-File so much slower for $stripped than for $original? How can I improve it?
Thanks!
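One possible direction, offered as a sketch rather than a confirmed diagnosis: Select-String emits MatchInfo objects rather than plain strings, so Out-File has to pass each one through the formatting system, whereas $original holds plain strings. Extracting the .Line property before writing may avoid that cost. The sample data, the $matchInfos and $outFile variables, and the temp file name below are hypothetical stand-ins for the real LDIF file, and no timing is claimed for the real data.

```powershell
# Hypothetical sample data standing in for the LDIF file contents.
$original = @('customA: 1', 'dn: cn=x', '', 'customB: 2', 'customlower: 3')

# Same filter as in the question. Select-String returns MatchInfo objects, not strings.
$matchInfos = $original | Select-String -Pattern '^custom[A-Z]', '^$' -CaseSensitive
$matchInfos[0].GetType().FullName

# Keep only the matched text, so that plain strings are written to the file.
$stripped = $matchInfos | ForEach-Object { $_.Line }

# Hypothetical output location; the question writes to $localDir instead.
$outFile = Join-Path ([System.IO.Path]::GetTempPath()) 'Stripped-sample.ldif'
(Measure-Command -Expression { $stripped | Set-Content -Path $outFile }).TotalSeconds
```

Note that Set-Content and Out-File use different default encodings in Windows PowerShell, so add an explicit -Encoding if the output format matters.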