0

I am trying to perform a replace operation on a 4 GB data file, but I cannot even read the file: the following command fails with a memory error.

$edwfile = (Get-Content C:\Users\tomgeorg\Desktop\edw_ord_extr_3x_SIQP_20181021.182305\edw_ord_extr_3x_SIQP_20181021.182305.dat -Raw ) 

Are there any alternative commands or tricks for processing a huge file?

I want to run the following replace pattern on each line in the file. Basically, I want to replace all the unwanted special characters with "?".

-replace  "[$([char]0x00)-$([char]0x09)$([char]0x0B)-$([char]0x1F)$([char]0x7F)-$([char]0xFF)]","?"


TomG
  • 2
    Don't use `-Raw` with large files. You'll need to process it in chunks using either the pipeline, `-ReadCount`, `-Stream`, or some combination of the above. – Maximilian Burszley Oct 24 '18 at 16:01
  • NB: At the moment you're just showing how to load the entire file into a single variable; so we can't give you much more help than @TheIncorrigible1's previous comment. If you need more assistance, please share info on what your process does / ideally sharing the related code, and we can advise how to better implement this suggestion. – JohnLBevan Oct 24 '18 at 16:03
  • 2
    Also don't put `Get-Content` in a subexpression/grouping expression or assign its output to a variable. Use the pipeline for processing the file(s) one line at a time. – Ansgar Wiechers Oct 24 '18 at 16:03
  • 1
    Some people use `Get-Content -ReadCount`. Many people turn to `System.IO.StreamReader` for performance gains. https://stackoverflow.com/questions/6855814/powershell-how-to-count-number-of-rows-in-csv-file/13992221#13992221 – lit Oct 24 '18 at 16:18

2 Answers

3

Below is a sample solution using streams. It reads the file line by line and writes each updated line to a new file.

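# Open the source for reading and the target for writing.
# Use full paths: .NET does not follow PowerShell's current location.
$reader = [System.IO.StreamReader]"C:\temp\OriginalFile.txt"
$writer = [System.IO.StreamWriter]"C:\temp\UpdatedFile.txt"

try {
    # Only one line is held in memory at a time.
    while (!$reader.EndOfStream) {
        $writer.WriteLine(($reader.ReadLine() -replace '\|', ';'))
    }
}
finally {
    # Close both streams even if an error occurs, so buffered output is flushed.
    $reader.Close()
    $writer.Close()
}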
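
To apply the pattern from the question rather than the `|` to `;` swap, the same loop could be adapted roughly as follows. This is only a sketch under assumptions: the function name `Convert-UnwantedChars` and its parameters are hypothetical, the data is assumed to be single-byte encoded (so it is read as ISO-8859-1, where each byte maps to the same code point and the `\x7F-\xFF` range matches raw bytes), and the character class is the one from the question rewritten with regex hex escapes. `::new` needs PowerShell 5 or later.

```powershell
# Hypothetical helper: streams a file line by line and replaces unwanted characters with "?".
function Convert-UnwantedChars {
    param(
        [Parameter(Mandatory)] [string] $InPath,   # full path to the source file
        [Parameter(Mandatory)] [string] $OutPath   # full path to the file to create
    )

    # Assumption: single-byte data, so ISO-8859-1 keeps byte values and code points identical.
    $latin1  = [System.Text.Encoding]::GetEncoding('iso-8859-1')
    # Same character class as in the question, written with hex escapes.
    $pattern = '[\x00-\x09\x0B-\x1F\x7F-\xFF]'

    $reader = $null
    $writer = $null
    try {
        $reader = [System.IO.StreamReader]::new($InPath, $latin1)
        $writer = [System.IO.StreamWriter]::new($OutPath, $false, $latin1)

        # ReadLine returns $null at end of file; only one line is in memory at a time.
        while ($null -ne ($line = $reader.ReadLine())) {
            $writer.WriteLine(($line -replace $pattern, '?'))
        }
    }
    finally {
        # Dispose whatever was opened, flushing buffered output.
        if ($writer) { $writer.Dispose() }
        if ($reader) { $reader.Dispose() }
    }
}
```

Note that `ReadLine` consumes the line terminator and `WriteLine` writes the platform's newline, so line endings in the output may differ from the input.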
Mike Twc
2

Assuming you want to work on one line at a time, use the pipeline for your task:

$path = '~\Desktop\edw_ord_extr_3x_SIQP_20181021.182305\edw_ord_extr_3x_SIQP_20181021.182305.dat'
Get-Content -Path $path | ForEach-Object {
    # do something line-by-line with the file; $_ holds the current line
    $_
} | Set-Content -Path "$path.out"  # hypothetical output file; or do something else with the output

Without knowing what you're doing with the file, it's hard to give a more complete answer.
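
If the goal is the replacement shown in the question, one possible completion of the pipeline is sketched below. The function name `Convert-FileLines` and its parameters are hypothetical, and the pattern is the question's character class rewritten with regex hex escapes. `-ReadCount` sends lines down the pipeline in batches (1000 per batch here, an arbitrary choice), which tends to be faster than one line at a time while still avoiding loading the whole file.

```powershell
# Hypothetical helper: pipeline-based replace that never holds the whole file in memory.
function Convert-FileLines {
    param(
        [Parameter(Mandatory)] [string] $Path,     # source file
        [Parameter(Mandatory)] [string] $OutPath,  # file to create or overwrite
        [int] $BatchSize = 1000                    # lines per pipeline batch (arbitrary default)
    )

    # Same character class as in the question, written with hex escapes.
    $pattern = '[\x00-\x09\x0B-\x1F\x7F-\xFF]'

    # With -ReadCount, $_ is an array of lines; -replace is applied to each element.
    Get-Content -Path $Path -ReadCount $BatchSize |
        ForEach-Object { $_ -replace $pattern, '?' } |
        Set-Content -Path $OutPath
}
```

Default encodings for `Get-Content` and `Set-Content` differ between Windows PowerShell and PowerShell 7, so if bytes above 0x7F matter it may be worth passing `-Encoding` explicitly on both cmdlets.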

Maximilian Burszley