I need help removing double quotes from the content of every file in a directory. I can do it for one file at a time, and that is fast, but I am not sure how to do it for all the files without hurting performance. There are more than 600 files in the directory.

PS script for one file:

(gc C:\Temp\data.txt -En UTF8) | ForEach-Object {$_ -replace '"',''} | Out-File C:\Temp\data.txt -En UTF8

I tried to do the same for all the files in a folder using the code below, but it has been too slow.

PS script for all files:

Get-ChildItem "C:\Temp" -Filter *.txt | 
Foreach-Object {
    # read the file line by line
    $content = Get-Content $_.FullName

    # remove the double quotes from every line and keep the result
    $content = $content | % {$_ -replace '"', ''} 

    # save the content back to the same file
    $content | Out-File $_.FullName -En UTF8
}
paone
  • `Get-Content` is [known to be slow](https://stackoverflow.com/questions/47349306/powershell-get-content-with-basic-manipulations-so-slow) – boxdog May 14 '21 at 22:11
  • Are your files just text files or CSV files? If CSV, then simply removing all quotes is hazardous. Read [this](https://stackoverflow.com/a/60681762/9898643) – Theo May 15 '21 at 13:01
  • Hi @Theo, they are all text files. – paone May 15 '21 at 16:21

1 Answer

Assuming each input text file fits into memory as a whole (which is likely), you can use Get-Content's -Raw switch as follows, which greatly speeds up the operation:

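Get-ChildItem C:\Temp -Filter *.txt | 
  Foreach-Object {
    # read the whole file as one string, strip the double quotes,
    # and write the result back to the same file
    (Get-Content -Raw $_.FullName) -replace '"' |
      Set-Content -NoNewLine $_.FullName -En UTF8
  }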

Note:

  • -Raw reads the entire contents of a file into memory as a single, (typically) multi-line string.

  • Omitting the replacement operand of the -replace operator implicitly uses '' (the empty string) as the replacement string.

  • If the input already is text (strings), Set-Content performs better than Out-File / >.

  • -NoNewLine (PSv5+) ensures that Set-Content / Out-File don't blindly append a newline. (It actually also suppresses placing newlines between the (stringified) input objects, but in the case at hand there's only one input object).
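
The points above can be checked on a throwaway file. The following is a minimal sketch, assuming a hypothetical scratch file named quote-demo.txt in the temp directory; the variable names are illustrative and not part of the original commands. It passes -Encoding UTF8 on reads as well, as the single-file command in the question does.

```powershell
# Hypothetical scratch file, used only for this demonstration.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'quote-demo.txt'

# Write two quoted lines, with no newline after the last one.
$sample = '"a","b"' + "`r`n" + '"c"'
Set-Content -NoNewline -Encoding UTF8 -LiteralPath $path -Value $sample

# Without -Raw, Get-Content emits one string per line.
$lines = Get-Content -Encoding UTF8 -LiteralPath $path

# With -Raw, the whole file arrives as a single multi-line string.
$text = Get-Content -Raw -Encoding UTF8 -LiteralPath $path

# Omitting the replacement operand of -replace removes every match.
$clean = $text -replace '"'

# -NoNewline writes the string back without appending a newline.
Set-Content -NoNewline -Encoding UTF8 -LiteralPath $path -Value $clean

# Read the result back for inspection, then clean up.
$result = Get-Content -Raw -Encoding UTF8 -LiteralPath $path
Remove-Item -LiteralPath $path
```

Here $lines holds two strings while $text is a single string, which is why the -Raw version runs the replacement once per file rather than once per line.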

mklement0