Get-Content -Raw makes PowerShell read the entire file into a single string.
By default, .NET can't store an individual object larger than 2 GB in memory, and each character in a string takes up 2 bytes, so after reading roughly the first 1 billion characters (about the size of a 1 GB ASCII-encoded text file), Get-Content -Raw hits that limit.
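The arithmetic behind that estimate can be checked directly in PowerShell. This is only a back-of-the-envelope sketch; the variable names are made up for illustration.

```powershell
# Size limit for a single .NET object (2GB is a built-in PowerShell numeric suffix).
$maxObjectBytes = 2GB

# .NET strings are UTF-16, so each character occupies 2 bytes.
$bytesPerChar = 2

# Approximate number of characters that fit into one string.
$maxChars = $maxObjectBytes / $bytesPerChar

# An ASCII file uses 1 byte per character, so this many characters
# corresponds to a file of about 1 GB on disk.
$maxChars -eq 1GB
```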
Remove the -Raw switch; -replace is perfectly capable of operating on multiple input strings at once:
(Get-Content -path C:\Workspace\workfile\myfile.txt) -replace '\"', '"' | Set-Content C:\Workspace\workfile\myfileCLEAN.txt
Beware that -replace is a regex operator, and if you want to remove \ from a string, you need to escape it:
(Get-Content -path C:\Workspace\workfile\myfile.txt) -replace '\\"', '"' | Set-Content C:\Workspace\workfile\myfileCLEAN.txt
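To see the difference between the two patterns without touching a large file, here is a small sketch on a hypothetical two-line sample (the sample text and variable names are assumptions for illustration). It also shows -replace operating on an array of strings.

```powershell
# Hypothetical sample lines standing in for the file's contents.
$lines = 'He said \"hi\"', 'plain line'

# Unescaped: the regex \" matches only the quote character,
# so each quote is replaced by itself and the backslashes survive.
$unescaped = $lines -replace '\"', '"'

# Escaped: the regex \\" matches a literal backslash followed by a quote.
$escaped = $lines -replace '\\"', '"'

# [regex]::Escape produces the escaped pattern from the literal text.
$pattern = [regex]::Escape('\"')

$unescaped[0]   # He said \"hi\"
$escaped[0]     # He said "hi"
$pattern        # \\"
```

The -replace call on the file contents behaves the same way, one line at a time.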
While this will work, it'll still be slow, because the whole file (more than 2 GB of data) is still loaded into memory before -replace is applied and anything is written to the output file.
Instead, you might want to pipe the output from Get-Content to the ForEach-Object cmdlet:
Get-Content -Path C:\Workspace\workfile\myfile.txt | ForEach-Object {
    $_ -replace '\\"', '"'
} | Set-Content C:\Workspace\workfile\myfileCLEAN.txt
This allows Get-Content to start emitting lines before it has finished reading the file, so PowerShell no longer needs to hold the whole file in memory at once, which results in faster execution.
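If the line-by-line pipeline is still too slow, one possible variation is to let Get-Content hand over lines in batches with its -ReadCount parameter, since -replace accepts an array of strings. The sketch below is not part of the original approach: it runs against a tiny stand-in file in the temp directory, and the file names and the batch size of 2 are assumptions chosen only to keep the demonstration small. A much larger batch size would be more sensible for a real file.

```powershell
# Hypothetical paths in the temp directory, used only for this demonstration.
$inPath  = Join-Path ([System.IO.Path]::GetTempPath()) 'replace-demo-in.txt'
$outPath = Join-Path ([System.IO.Path]::GetTempPath()) 'replace-demo-out.txt'

# Small stand-in for the large input file.
Set-Content -Path $inPath -Value 'a \"quoted\" word', 'no quotes here', 'trailing \"'

# -ReadCount 2 makes Get-Content emit arrays of up to 2 lines at a time;
# -replace then processes each array in a single call.
Get-Content -Path $inPath -ReadCount 2 | ForEach-Object {
    $_ -replace '\\"', '"'
} | Set-Content -Path $outPath

# Read the cleaned file back to inspect the result.
$result = Get-Content -Path $outPath
$result
```

Memory use stays bounded by the batch size rather than the file size, so the trade-off between speed and memory can be tuned by changing the number passed to -ReadCount.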