
I need to replace \x0d\x0a with \x2c\x0d\x0a in a file.

I can do it relatively easily on Unix:

awk '(NR>1){gsub("\r$",",\r")}1' "$file" > "fixed_$file"

I need help implementing this in PowerShell.

Thank you in advance.

mklement0
daimne

1 Answer


Assuming that you're running this on Windows, where CRLF (\r\n) newlines are the default, the following command is the equivalent of your awk command:

Get-Content $file | ForEach-Object { 
  if ($_.ReadCount -eq 1) { $_ } else { $_ -replace '$', ',' }
} | Set-Content "fixed_$file"

Caveat: The character encoding of the input file is not preserved; Set-Content uses its own default encoding, which you can override with -Encoding.
In Windows PowerShell, this default is the system's "ANSI" encoding, whereas in PowerShell Core it is BOM-less UTF-8.
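
If the input is known to be UTF-8, a minimal sketch of overriding the default might look like the following. The file name sample.csv and its contents are hypothetical and exist only to make the sketch self-contained. Note that in Windows PowerShell, -Encoding utf8 writes a BOM, and Get-Content needs -Encoding utf8 there to read a BOM-less UTF-8 file correctly.

```powershell
# Hypothetical sample input, created here only so the sketch is self-contained.
$file = 'sample.csv'
Set-Content -Path $file -Value 'id,name', '1,alice', '2,bob' -Encoding utf8

# Same pipeline as above, but with the encoding stated explicitly on both ends.
Get-Content $file -Encoding utf8 | ForEach-Object {
  if ($_.ReadCount -eq 1) { $_ } else { $_ -replace '$', ',' }
} | Set-Content "fixed_$file" -Encoding utf8
```

Stating the encoding on both the reading and the writing side avoids depending on edition-specific defaults.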

  • Get-Content $file reads the input file line by line.

  • The ForEach-Object loop passes the 1st line ($_.ReadCount -eq 1) through as-is ($_), and appends , (which is what escape sequence \x2c in your awk command represents) to all others ($_ -replace '$', ',').

    • Note: $_ + ',' or "$_," are simpler alternatives for appending a comma; the regex-based
      -replace operator was used here to highlight the PowerShell feature that is similar to awk's gsub().
  • Set-Content then writes the resulting lines to the target file, terminating each with the platform-appropriate newline sequence, which on Windows is CRLF (\r\n).
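
As a small illustration of the note above, the three ways of appending a comma can be compared on a single string. The variable names and the sample value are made up for this sketch.

```powershell
# Hypothetical sample line, standing in for $_ inside the ForEach-Object block.
$line = 'a,b'

$viaReplace = $line -replace '$', ','  # regex: $ matches at the end of the string
$viaConcat  = $line + ','              # plain string concatenation
$viaInterp  = "$line,"                 # string interpolation

# All three variables now hold the same string.
$viaReplace, $viaConcat, $viaInterp
```

For a simple append, the concatenation or interpolation form avoids the regex engine altogether; -replace becomes worthwhile once the match itself needs a pattern.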

mklement0
  • I've been replacing \r\n when I could have been replacing the regex $ all this time? THANK YOU! – Robert Cotterman Oct 24 '18 at 01:36
  • @RobertCotterman: With line-by-line input: yes, given that lines are usually stripped of a trailing newline. However, if the input is a multi-line string and you want to match line endings, more work is needed. – mklement0 Oct 24 '18 at 01:41
  • Ah, thanks. I have also NEVER used Set-Content; it seems it can't append, so why would I use Set-Content rather than Out-File? I mean, there is Add-Content too, but this seems to get tricky. Why PowerShell? I tried to read up on the differences, and there are some, but do you know of examples where one outweighs the other no matter what? (No crazy answer needed; maybe you already wrote an answer somewhere?) – Robert Cotterman Oct 24 '18 at 01:43
  • @RobertCotterman: Please see https://stackoverflow.com/a/44246434/45375 – mklement0 Oct 24 '18 at 01:46
  • So it seems the 2 main reasons are encoding and speed. And since encoding is no longer an issue (lucky for me, since I'm newer to PowerShell), it seems speed is the main benefit. And, I suppose, -PassThru. Thank you for some insight. I feel Set-Content proves veteranism. – Robert Cotterman Oct 24 '18 at 01:53
  • @RobertCotterman: Yes, encoding differences do not affect PowerShell _Core_, but they will continue to plague _Windows PowerShell_. Also note that with _non-string_ types the fundamental difference between `Out-File` and `Set-Content` described in the linked answer will never go away. – mklement0 Oct 24 '18 at 01:58
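
On the multi-line case raised in the comments: without the multiline regex option, $ does not match at the end of every line of a multi-line string, so one possible approach is to match the newlines explicitly. The sketch below uses a hypothetical sample string and, unlike the line-by-line command in the answer, appends a comma to every line, including the first.

```powershell
# Hypothetical multi-line input held in one string, as Get-Content -Raw would return it.
$text = "h1,h2`r`n1,2`r`n3,4`r`n"

# Match each newline explicitly (LF with an optional preceding CR)
# and re-emit it as CRLF preceded by a comma.
$fixed = $text -replace '\r?\n', ",`r`n"
```

Because the pattern accepts both CRLF and bare LF, the output is normalized to CRLF regardless of which newline style the input used.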