Need to replace \x0d\x0a with \x2c\x0d\x0a in a file using PowerShell

Question

Need to replace \x0d\x0a with \x2c\x0d\x0a in a file.

I can do it relatively easily on Unix:

awk '(NR>1){gsub("\r$",",\r")}1' $file > "fixed_$file"

Need help with implementing this in PowerShell.

Thank you in advance.

I also encourage you to revisit your previous questions to see if an answer there should be accepted. – mklement0, Oct 24 '18 at 01:12

mklement0 · Answer 1 · 2018-10-24T01:02:18.743


Assuming that you're running this on Windows (where \r\n (CRLF) newlines are the default), the following command is the equivalent of your awk command:

Get-Content $file | ForEach-Object { 
  if ($_.ReadCount -eq 1) { $_ } else { $_ -replace '$', ',' }
} | Set-Content "fixed_$file"

Caveat: The character encoding of the input file is not preserved, and Set-Content uses a default, which you can override with -Encoding.
In Windows PowerShell, this default is the system's "ANSI" encoding, whereas in PowerShell Core it is BOM-less UTF-8.
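As a minimal sketch of the -Encoding override mentioned above: the file name, the sample content, and the choice of utf8 are assumptions made for illustration, so substitute whatever encoding matches your real input file.

```powershell
# Hypothetical sample input, created only so the sketch is self-contained.
$file = 'sample.txt'
Set-Content $file -Value 'header', 'a', 'b' -Encoding utf8

# Same pipeline as in the answer, but with an explicit output encoding
# instead of the platform default. 'utf8' is an example value only.
Get-Content $file | ForEach-Object {
  if ($_.ReadCount -eq 1) { $_ } else { $_ -replace '$', ',' }
} | Set-Content "fixed_$file" -Encoding utf8

# fixed_sample.txt now holds the lines: header / a, / b,
```

Note that in Windows PowerShell, -Encoding utf8 writes a UTF-8 file with a BOM, whereas PowerShell Core writes it without one. If the input file has no BOM, Get-Content may also need an explicit -Encoding argument to read it correctly.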

Get-Content $file reads the input file line by line.
The ForEach-Object loop passes the 1st line ($_.ReadCount -eq 1) through as-is ($_), and appends , (which is what escape sequence \x2c in your awk command represents) to all others ($_ -replace '$', ',').
- Note: $_ + ',' or "$_," are simpler alternatives for appending a comma; the regex-based
  -replace operator was used here to highlight the PowerShell feature that is similar to awk's gsub().
Set-Content then writes the resulting lines to the target file, terminating each with the platform-appropriate newline sequence, which on Windows is CRLF (\r\n).
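To illustrate the note about simpler alternatives, here is a small sketch that applies all three ways of appending a comma to one in-memory line. The variable names and the sample line are hypothetical.

```powershell
# Hypothetical sample line, as Get-Content would emit it (no trailing newline).
$line = 'a,b,c'

# Regex-based: '$' matches at the end of the string, so the comma is appended.
$viaReplace = $line -replace '$', ','

# Plain string concatenation.
$viaConcat = $line + ','

# String interpolation inside a double-quoted string.
$viaInterpolation = "$line,"

# All three variables now contain 'a,b,c,'
```

The regex form only behaves like an append because each line arrives without its newline; the other two forms avoid the regex engine entirely.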

edited Oct 24 '18 at 01:02

answered Oct 23 '18 at 21:21

mklement0


I've been replacing \r\n when I could have been replacing the regex $ all this time? THANK YOU! – Robert Cotterman Oct 24 '18 at 01:36
@RobertCotterman: With line-by-line input: yes, given that lines are usually stripped of a trailing newline. However, if the input is a multi-line string and you want to match line endings, more work is needed. – mklement0 Oct 24 '18 at 01:41
Ah, thanks. I have also NEVER used Set-Content; it seems it can't append, so why would I use Set-Content rather than Out-File? I mean, there is Add-Content too, but this seems to get tricky. Why, PowerShell? I tried to read up on the differences and there are some, but do you know of examples where one outweighs the other no matter what? (No crazy answer needed; maybe you already wrote an answer somewhere?) – Robert Cotterman Oct 24 '18 at 01:43
@RobertCotterman: Please see https://stackoverflow.com/a/44246434/45375 – mklement0 Oct 24 '18 at 01:46
So it seems the 2 main reasons are encoding and speed. And since encoding is no longer an issue (lucky for me since I'm newer to powershell) it seems speed is the main benefit. And I suppose passthru. Thank you for some insight. I feel set-content proves veteranism. – Robert Cotterman Oct 24 '18 at 01:53
@RobertCotterman: Yes, encoding differences do not affect PowerShell _Core_, but they will continue to plague _Windows PowerShell_. Also note that with _non-string_ types the fundamental difference between `Out-File` and `Set-Content` described in the linked answer will never go away. – mklement0 Oct 24 '18 at 01:58
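The multi-line case raised in the comments can be sketched as follows. This is only one possible approach, not necessarily what the commenter had in mind, and the sample string and variable names are hypothetical. When the whole file is a single string (for example, read with Get-Content -Raw), a bare $ no longer matches at every line end, and even with the multiline option .NET's $ matches before \n rather than before \r\n, so this sketch matches the CRLF sequence explicitly.

```powershell
# Hypothetical multi-line input with CRLF line endings, standing in for
# the result of: Get-Content -Raw $file
$text = "header`r`na`r`nb`r`n"

# Split off the first line (including its CRLF) so that it stays untouched,
# mirroring the awk command, which skips line 1.
$first, $rest = $text -split '(?<=\r\n)', 2

# In the remainder, insert a comma before every CRLF.
$fixed = $first + ($rest -replace '\r\n', ",`r`n")

# $fixed is now "header`r`na,`r`nb,`r`n"
```

As with the awk command, a final line that lacks a trailing CRLF is left without a comma.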
