I know that you personally think the size of the file may not be the actual problem, but it's worth revisiting the fundamentals for the benefit of other readers.
`Get-Content`, when used in a pipeline, reads lines from a file one at a time. This object-by-object processing is a core feature of PowerShell's pipeline and acts as a memory throttle (there is no need to read all input into memory at once).
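As a quick illustration, the following sketch streams a large file line by line; the pattern `'error'` is just an arbitrary placeholder, and the file name mirrors the one from your question:

```powershell
# Streams line by line: each line travels through the pipeline individually,
# so memory use stays flat regardless of file size.
Get-Content ./input_file_name.Tmp | ForEach-Object {
  # Only the current line ($_) is held in memory at this point.
  if ($_ -match 'error') { $_ }
}
```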
There are only three scenarios where `Get-Content` reads the whole file into memory (each is demonstrated in the sketch after this list):

• If you capture `Get-Content`'s output in a variable (`$content = Get-Content ...`), in which case the variable receives an array comprising all lines.

• If you enclose the `Get-Content` call in `(...)`, `$(...)`, or `@(...)`, which also returns an array of all lines.

• If you use the `-Raw` switch, which makes `Get-Content` return a single, multi-line string.
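A minimal sketch of all three scenarios, using a hypothetical file `sample.txt`:

```powershell
# 1. Capturing in a variable: $lines receives an array of ALL lines.
$lines = Get-Content ./sample.txt

# 2. Enclosing the call in (...), $(...), or @(...): also yields an array of all lines.
$count = (Get-Content ./sample.txt).Count

# 3. -Raw: the entire file is returned as a single, multi-line string.
$text = Get-Content -Raw ./sample.txt
```

In all three cases the full file content ends up in memory at once, unlike in the streaming pipeline case.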
Using `-TotalCount 100` (or `-First 100`) doesn't change this fundamental behavior: after 100 lines have been read, `Get-Content` stops reading and closes the file.
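For example, the following command (using the file names from your question) reads just the first 100 lines and then closes the file, no matter how large the input is:

```powershell
# Stops reading after 100 lines; the rest of the file is never touched.
Get-Content -TotalCount 100 ./input_file_name.Tmp |
  Out-File -Encoding Default output_file_name_100.Tmp
```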
The code in your question therefore doesn't explain your symptom - you shouldn't run out of memory - at least not because the input file is large; if it still happens, you may be seeing a bug.
If you have a reproducible case, I encourage you to file a bug in the Windows PowerShell UserVoice forum or, if you can (also) reproduce the bug in PowerShell [Core] v6+, at the PowerShell Core GitHub repo.
In the meantime, you can consider using .NET directly, which is also generally faster than using PowerShell's cmdlets:
```powershell
# [IO.File]::ReadLines() lazily enumerates the file's lines;
# [Linq.Enumerable]::Take() stops the enumeration after 100 of them.
[Linq.Enumerable]::Take([IO.File]::ReadLines("$PWD/input_file_name.Tmp"), 100) |
  Out-File -Encoding Default output_file_name_100.Tmp
```
Note:
• `"$PWD/"` is used as part of the input file path, because .NET's working directory typically differs from PowerShell's.
• In PowerShell type literals (`[...]`), the `System.` part of the full type name can be omitted; thus `[Linq.Enumerable]` refers to `System.Linq.Enumerable`, and `[IO.File]` to `System.IO.File`.
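A quick way to verify this equivalence in a PowerShell session:

```powershell
# Both type literals resolve to the same .NET type, so this outputs True.
[IO.File] -eq [System.IO.File]
```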