Editor's note:

  • The macOS sed command below performs an in-place (-i '') string-substitution (string-replacement) operation on the given file, i.e. it transforms the file's existing content. The specific substitution shown, s/././g, replaces all non-newline characters (regex metacharacter .) with verbatim . characters, so be careful when trying the command yourself.

  • While the intended question may ultimately be a different one, as written the question is well-defined, and can be answered to show the full PowerShell equivalent of the sed command (a partial translation is in the question itself), notably including the in-place updating of the file.


I have a macOS command that I need to run on Windows. I have no experience with macOS whatsoever.

sed -i '' 's/././g' dist/index.html

After some research, I found that I should use

get-content path | %{$_ -replace 'expression','replace'}

but I can't get it to work yet.

mklement0
Muhammad Mahmoud

1 Answer

Note:

  • The assumption is that s/././g in your sed command is just an example string substitution that you've chosen as a placeholder for real-world ones. What this example substitution does is replace every character other than a newline (regex .) with a verbatim . character. Therefore, do not run the commands below as-is on your files, unless you're prepared to have all their characters turn into . characters.
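
To see why the placeholder substitution is destructive, here is a minimal sketch that applies it to a sample string and contrasts it with an escaped pattern that matches only literal periods. The sample string and the ! replacement are made up for illustration:

```powershell
# Hypothetical sample input, used only for illustration.
$sample = 'a.b'

# An unescaped '.' is a regex metacharacter: it matches any character except a newline.
$allDots = $sample -replace '.', '.'

# An escaped '\.' matches only a literal period; here each one is replaced with '!'.
$onlyPeriods = $sample -replace '\.', '!'

$allDots      # ...
$onlyPeriods  # a!b
```

If a real-world search string is meant literally, escaping regex metacharacters in this way (or passing the string through [regex]::Escape()) avoids the surprise.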

The direct translation of your sed command, which performs in-place updating of the input file, is (ForEach-Object is the name of the cmdlet that the built-in % alias refers to):

(Get-Content dist/index.html) | 
  ForEach-Object { $_ -replace '.', '.' } |
    Set-Content dist/index.html -WhatIf

Note: The -WhatIf common parameter in the command above previews the operation. Remove -WhatIf once you're sure the operation will do what you want.

Or, more efficiently:

(Get-Content -ReadCount 0 dist/index.html) -replace '.', '.' | Set-Content dist/index.html -WhatIf

-ReadCount 0 reads the lines into a single array before outputting the result, instead of the default behavior of emitting each line one by one to the pipeline.
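
The difference can be observed by counting how many objects travel through the pipeline. The following is only a rough sketch; the temporary file name and its three lines are assumptions made for the demonstration:

```powershell
# Hypothetical temporary file, created only for this demonstration.
$tmp = Join-Path ([System.IO.Path]::GetTempPath()) 'readcount-demo.txt'
Set-Content -LiteralPath $tmp -Value 'one', 'two', 'three'

# Default behavior: each line is emitted separately,
# so the ForEach-Object script block runs once per line.
$perLine = 0
Get-Content -LiteralPath $tmp | ForEach-Object { $perLine++ }

# -ReadCount 0: all lines are emitted as one array,
# so the ForEach-Object script block runs only once.
$perBatch = 0
Get-Content -LiteralPath $tmp -ReadCount 0 | ForEach-Object { $perBatch++ }

"$perLine pipeline objects by default, $perBatch with -ReadCount 0"

Remove-Item -LiteralPath $tmp
```

Because -replace accepts an array as its left-hand operand and processes each element, the single array can be handed to the operator directly, which is what the one-liner above relies on.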

Or, even more efficiently, if line-by-line processing isn't required and the -replace operation can be applied to the entire file content, using the -Raw switch:

(Get-Content -Raw dist/index.html) -replace '.', '.' | Set-Content -NoNewLine dist/index.html -WhatIf
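
As a rough check of the -Raw variant, the sketch below runs it against a throwaway file that uses LF-only newlines and then inspects the rewritten content. The file name, the sample text, and the foo/bar substitution are assumptions chosen for illustration:

```powershell
# Hypothetical temporary file with LF-only newlines, created only for this demonstration.
$tmp = Join-Path ([System.IO.Path]::GetTempPath()) 'raw-demo.txt'
[System.IO.File]::WriteAllText($tmp, "foo 1`nfoo 2`n")

# Read the whole file as one multiline string, replace, and write it back
# without letting Set-Content append a newline of its own.
(Get-Content -Raw -LiteralPath $tmp) -replace 'foo', 'bar' |
  Set-Content -NoNewline -LiteralPath $tmp

# Read the result back to inspect the exact characters that were written.
$result = [System.IO.File]::ReadAllText($tmp)
$result -ceq "bar 1`nbar 2`n"   # True: text replaced, LF newlines left as they were

Remove-Item -LiteralPath $tmp
```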

Note:

  • -replace, the regular-expression-based string replacement operator, uses the syntax <input> -replace <regex>, <replacement> and invariably performs global replacements (as requested by the g option in your sed command), i.e. it replaces all matches it finds.

    • Unlike sed's regular expressions, however, PowerShell's are case-insensitive by default; to make them case-sensitive, use the -creplace operator variant.
  • Note the required (...) around the Get-Content call: it ensures that the file is read into memory in full and closed again before Set-Content runs, which is the prerequisite for being able to rewrite the same file in the same pipeline.

    • Caveat: While unlikely, this approach can result in data loss, namely if the write operation that saves back to the input file gets interrupted.
  • You may need -Encoding with Set-Content to ensure that the rewritten file uses the same character encoding as the original content - Get-Content reads text files into .NET strings recognizing a variety of encodings, and no information is retained as to what encoding was encountered.

  • Except with the Get-Content -Raw / Set-Content -NoNewLine solution, which preserves the original newline format, the output file will use the platform-native newline format - CRLF (\r\n) on Windows, LF (\n) on Unix-like platforms - irrespective of which format the input file originally used.
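
The case-sensitivity difference between -replace and -creplace mentioned in the notes can be illustrated with a short sketch; the sample string and the x replacement are made up:

```powershell
# Hypothetical sample input, used only for illustration.
$text = 'Title title TITLE'

# -replace ignores case by default, so all three words match.
$insensitive = $text -replace 'title', 'x'

# -creplace is case-sensitive, so only the all-lowercase word matches.
$sensitive = $text -creplace 'title', 'x'

$insensitive  # x x x
$sensitive    # Title x TITLE
```

When porting a sed substitution that has no case-insensitivity flag, -creplace is therefore the closer match in behavior.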

mklement0
  • Would using `Get-Content -Raw` change anything, for better or worse? Are there any hazards in processing markup language as described in https://stackoverflow.com/a/1732454/447901? – lit Feb 16 '22 at 23:14
  • @lit, if _line-by-line_ processing isn't required, `-Raw`, which reads the entire file into a _single, multiline string_, provides a significant performance boost. As the post you link to argues, regexes shouldn't be used to parse markup languages. Please also see my update re `-ReadCount 0`. – mklement0 Feb 17 '22 at 00:15