
I have a pipe-delimited .TXT file. I need to change the delimiter to a comma but keep the file extension as .TXT. The file looks like this:

Column 1 |Column 2

13|2019-09-30

96|2019-09-26

173|2019-09-25

I am using Windows PowerShell 5.1 for my script.

I am using the following code:

$file = New-Object System.IO.StreamReader -Arg "c:\file.txt"
$outstream = [System.IO.StreamWriter] "c:\out.txt"
while ($line = $file.ReadLine()) {
    $s = $line -replace '|', ','
    $outstream.WriteLine($s)
}
$file.close()
$outstream.close()

Instead of just replacing the pipe with a comma, the output file looks like this:

C,o,l,u,m,n, 1 , |,C,o,l,u,m,n, 2

1,3,|,2,0,1,9,-,0,9,-,3,0

9,6,|2,0,1,9,-,0,9,-,2,6

1,7,3,|,2,0,1,9,-,0,9,-,2,5

Arijita

3 Answers


The only problem with your code is how you try to replace the | characters in the input:

$s = $line -replace '|', ',' # WRONG

PowerShell's -replace operator expects a regex (regular expression) as its first RHS operand, and | is a regex metacharacter (has special meaning)[1]; to use it as a literal character, you must \-escape it:

# '\'-escape regex metacharacter '|' to treat it literally.
$s = $line -replace '\|', ','

While PowerShell's -replace operator is very flexible, in simple cases such as this one you can alternatively use the [string] type's .Replace() method, which performs literal string replacements and therefore doesn't require escaping (it's also faster than -replace):

# Use literal string replacement.
# Note: .Replace() is case-*sensitive*, unlike -replace
$s = $line.Replace('|', ',')
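
Putting it together, here is a minimal sketch of your full loop with the corrected replacement; the paths and stream handling are taken from your question, and (as an extra assumption on my part) the loop condition is tightened so it only stops at end-of-stream rather than at the first blank line:

$file = New-Object System.IO.StreamReader -Arg "c:\file.txt"
$outstream = [System.IO.StreamWriter] "c:\out.txt"
# ReadLine() returns $null only at end-of-stream; comparing against $null
# avoids exiting early on blank lines (which coerce to $false).
while ($null -ne ($line = $file.ReadLine())) {
    # '\'-escape the regex metacharacter '|' so it is replaced literally.
    $s = $line -replace '\|', ','
    $outstream.WriteLine($s)
}
$file.Close()
$outstream.Close()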

[1] | denotes an alternation in a regex, meaning that the subexpressions on either side are matched against the input string and one of them matching is sufficient; if your full regex is just |, it effectively matches the empty string before and after each character in the input, which explains your symptom; e.g., 'foo' -replace '|', '@' yields @f@o@o@
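
To see the footnote's point in isolation, here is a quick demonstration you can run at a prompt (expected output shown in the comments):

'foo' -replace '|', '@'     # -> @f@o@o@  (the empty string matches before and after every character)
'foo' -replace '\|', '@'    # -> foo      (no literal '|' in the input, so nothing is replaced)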

mklement0
  • This is a very good explanation, and also applies to `-split` in my answer. – Jeff Zeitlin Oct 04 '19 at 10:55
  • Thanks, @JeffZeitlin; yes, your `-split` / `-join` combo definitely works too, and the points about `-replace` vs. `.Replace()` apply analogously to `-split` vs. `.Split()`. – mklement0 Oct 04 '19 at 12:05
  • Thank you! Escaping makes sense and worked beautifully. I tried .Replace() as well, and it did the job a couple of seconds quicker than -replace with the escape. – Arijita Oct 04 '19 at 13:25
  • I'm glad to hear it, @Arijita. Allow me to give you the standard advice to newcomers in the next comment: – mklement0 Oct 04 '19 at 13:27

You can use Import-Csv and Export-Csv, specifying the -Delimiter parameter for each.

Import-Csv -Delimiter '|' -Path "c:\file.txt" | Export-Csv -Delimiter ',' -Path "c:\out.txt" -NoTypeInformation
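
Assuming the sample data from the question, the resulting out.txt would look roughly like this; note that Windows PowerShell 5.1's Export-Csv double-quotes every field (see the comments below):

"Column 1 ","Column 2"
"13","2019-09-30"
"96","2019-09-26"
"173","2019-09-25"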
Jacob
  • `Export-CSV` will generate a bogus line at the top of the file, unless you include the `-NoTypeInformation` switch. But also see my comment to @StanislavCastek with my answer. – Jeff Zeitlin Oct 04 '19 at 10:58
  • Thank you! (I was avoiding this way since it is a huge file and this approach takes a while to finish.) – Arijita Oct 04 '19 at 13:27
  • @JeffZeitlin, You are correct, I always forget to include `-NoTypeInformation` initially. I've updated the answer to include that. @Arijita, that's a fair observation, AFAIK `Import-Csv` parses everything into objects in memory before `Export-Csv` converts it back to CSV. So consider this the easy method, but definitely not the most performant approach. – Jacob Oct 05 '19 at 09:18

You will find the -split and -join operators to be of interest.

Get-Content -Path "C:\File.TXT" | ForEach-Object { ($_ -split "\|") -join "," } | Set-Content -Path "C:\Out.TXT"
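
As with -replace, the -split operator treats its pattern as a regex, which is why the | is escaped above. If you prefer a literal split, a sketch using the [string] type's .Split() method (same paths assumed) would be:

Get-Content -Path "C:\File.TXT" | ForEach-Object { $_.Split('|') -join "," } | Set-Content -Path "C:\Out.TXT"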
Jeff Zeitlin
  • Downvoted because the answer provided earlier by @Jacob is more effective – Stanislav Castek Oct 04 '19 at 00:19
  • @StanislavCastek - If you try Jacob's solution _as written_, there will be a problem - at the very least, a bogus line generated at the top of the file. While structurally, what the querent posted _appears_ to be conformant with a bog-standard CSV, I _never_ assume that posted sample data is in perfect conformance with reality; if there are other lines in the file that do not match the two-column CSV model, the `Import-CSV`/`Export-CSV` model will fail - whereas mine won't. _If_ the original file is perfectly CSV-conformant as the sample shows, I will concede that Jacob's solution is faster. – Jeff Zeitlin Oct 04 '19 at 10:52
  • @StanislavCastek - Had the querent _specifically_ stated that the original file was a CSV (regardless of the file name), I would have proposed Jacob's solution with the necessary correction. – Jeff Zeitlin Oct 04 '19 at 10:53
  • In that vein (+1): Even if you use the `-NoTypeInformation` switch, there is another potential complication (which again won't matter if the input file is well-formed and all that matters is whether the output is a valid CSV file): `Export-Csv` will invariably _double-quote_ all field values. – mklement0 Oct 04 '19 at 11:59