
I am using Sandcastle Help File Builder to produce a help file (.chm). The project is a .shfbproj file, which is an XML file that works with MSBuild.

I want to automatically update the footer text that appears in the generated .chm file. I use this snippet:

$newFooter = "<FooterText>MyProduct v1.2.3.4</FooterText>";

get-content  -Encoding ASCII $projFile.FullName | 
    %{$_ -replace '<FooterText>(.+)</FooterText>', $newFooter } > $TmpFile

move-item $TmpFile $projFile.FullName -force

The output redirected to $TmpFile is always written in a multi-byte encoding, which I don't want. How do I set the encoding of the output to ASCII?

Micha Wiedenmann
Cheeso
  • Thanks PowerShell for defaulting to an obscure output format (UCS-2 which was replaced in 1996!!??). utf8 would have been fine ;) – lmat - Reinstate Monica Nov 02 '12 at 19:56
  • See also: https://stackoverflow.com/questions/40098771/changing-powershells-default-output-encoding-to-utf-8 – Ian Kemp Jun 09 '17 at 06:45
  • It's possible to mix encodings with ">>" too. I would stick with Set-Content and Add-Content. – js2010 May 06 '19 at 20:45
  • @LimitedAtonement: Yes, that decision was unfortunate, but fortunately the problem was rectified in PowerShell _Core_ (v6+), where BOM-less UTF-8 is now the default. As an aside: what Windows PowerShell by default uses with `>` / `Out-File` (but not with `Set-Content`) is _UTF-16LE_ (which is part of the Unicode standard and won't go away), _not_ UCS-2 (which is indeed obsolete, because it couldn't represent _all_ Unicode characters). – mklement0 Oct 30 '19 at 21:26

6 Answers


You could change the $OutputEncoding variable before writing to the file. The other option is not to use the > operator, but instead pipe directly to Out-File and use the -Encoding parameter.

Tomalak
tomasr
  • The > operator is actually an alias for Out-File, which defaults to utf. tomasr is correct that you should use Out-File specifically to set the encoding. – James Pogran Nov 10 '09 at 15:27
  • @JamesPogran What about `2>`? I'm not sure how, in PowerShell, to run a command like `cmd > out 2> err` through the pipes to Out-File (explicitly specifying an encoding). Also, "which defaults to utf"; what do you mean by utf? utf8? utf16? utf32? I thought the default is UCS-2 anyway, not UTF (UCS Transfer Format); did you see it documented otherwise somewhere? – lmat - Reinstate Monica Nov 02 '12 at 20:05
  • The `Out-File -Encoding` recommendation is helpful, but `$OutputEncoding` does _not_ apply here, because it is unrelated to `>` / `>>`. Its sole purpose is to define the encoding to use when sending data from PowerShell _to an external program_. – mklement0 Oct 30 '19 at 21:19
  • e.g. `git diff HEAD~1 HEAD | Out-File -Encoding ascii patch.txt` – Robert Bernstein Feb 05 '20 at 21:34
  • Your link is dead – Eric Aug 19 '21 at 09:27

The > redirection operator is effectively a shortcut for Out-File. In Windows PowerShell, Out-File's default encoding is "Unicode" (UTF-16LE), but you can change it to ASCII by piping to Out-File explicitly:

Get-Content -Encoding ASCII $projFile.FullName |
    % { $_ -replace '<FooterText>(.+)</FooterText>', $newFooter } |
    Out-File $tmpfile -Encoding ASCII
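
To confirm that the file really is single-byte, one option is to inspect its raw bytes. The following is only a sketch: the temp file name is made up for illustration, and the check looks for nothing more than the UTF-16LE byte order mark (FF FE) that Windows PowerShell's default `>` output starts with.

```powershell
# Hypothetical temp file; any writable path works.
$tmpFile = Join-Path ([System.IO.Path]::GetTempPath()) 'encoding-check.txt'

'<FooterText>MyProduct v1.2.3.4</FooterText>' | Out-File $tmpFile -Encoding ASCII

# Read the raw bytes. UTF-16LE output would start with the BOM FF FE
# and use two bytes per character; ASCII uses one byte per character.
$bytes = [System.IO.File]::ReadAllBytes($tmpFile)
$hasUtf16Bom = ($bytes.Length -ge 2) -and ($bytes[0] -eq 0xFF) -and ($bytes[1] -eq 0xFE)

"UTF-16LE BOM present: $hasUtf16Bom"
"First bytes: " + (($bytes | Select-Object -First 4 | ForEach-Object { $_.ToString('X2') }) -join ' ')
```

If the BOM check reports True, the file was written as UTF-16LE and the `-Encoding` argument did not take effect.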
mklement0
Shay Levy
  • Zomfg what a pain in the ass, http://stackoverflow.com/q/8216027/124486 So, essentially `generate_bat_file.pl > run_all.bat` is totally useless because executing a bat file -- even in PowerShell -- will result in an error if that `.bat` file is encoded in UTF16. – Evan Carroll Nov 21 '11 at 19:15

Piping to `| sc filename` does the trick (sc being an alias for Set-Content).

For `>> filename`, use `| ac filename` instead (ac being an alias for Add-Content).
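
A minimal sketch of the same idea with the full cmdlet names and an explicit `-Encoding` argument, so the result does not depend on the cmdlets' defaults. The output path is hypothetical. As far as I know, the `sc` alias is not available in PowerShell 6 and later, which is another reason to spell out `Set-Content` in scripts.

```powershell
# Hypothetical output path, for illustration only.
$outFile = Join-Path ([System.IO.Path]::GetTempPath()) 'footer-demo.txt'

# Equivalent of "> file": create or overwrite, forcing ASCII.
'First line' | Set-Content -Path $outFile -Encoding Ascii

# Equivalent of ">> file": append, using the same encoding.
'Next line' | Add-Content -Path $outFile -Encoding Ascii

Get-Content $outFile
```

Using the same encoding for both calls avoids ending up with a file that mixes encodings.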

Ruben Bartelink

I found I had to use the following:

write-output "First line" | out-file -encoding ascii OutputFileName
write-output "Next line" | out-file -encoding ascii -append OutputFileName
....

Changing the output encoding using:

$OutputEncoding = New-Object -typename System.Text.ASCIIEncoding

did not work.

Tom Hallam
  • Indeed, `$OutputEncoding` does _not_ apply here, because it is unrelated to `>` / `>>` / `Out-File` (or _any_ cmdlet, for that matter). Its sole purpose is to define the encoding to use when sending data from PowerShell _to an external program_. However, that observation would have gained more exposure as a comment on the highly up-voted, but partially incorrect, accepted answer (I've since posted a comment - let's hope the answer gets fixed). You then wouldn't have needed to post an answer of your own (the rest of which duplicates Shay Levy's older answer). – mklement0 Oct 30 '19 at 21:37

You can set the default encoding of out-file to be ascii:

$PSDefaultParameterValues=@{'out-file:encoding'='ascii'}

Then something like this will result in an ascii file:

echo hi > out

In PowerShell 6 and 7, the default encoding of Out-File was changed to UTF-8 without a BOM.
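
Note that assigning a new hashtable to `$PSDefaultParameterValues` discards any defaults that were already set. A gentler variant, sketched here with a hypothetical temp file, adds a single entry and removes it again afterwards:

```powershell
# Add one entry instead of replacing the whole table.
$PSDefaultParameterValues['Out-File:Encoding'] = 'ascii'

# Hypothetical path, for illustration only.
$demoFile = Join-Path ([System.IO.Path]::GetTempPath()) 'default-encoding-demo.txt'
'hi' > $demoFile

# Inspect the raw bytes: 'h' is 0x68 and 'i' is 0x69 in ASCII.
$bytes = [System.IO.File]::ReadAllBytes($demoFile)
($bytes | ForEach-Object { $_.ToString('X2') }) -join ' '

# Remove the entry again so later commands use the built-in default.
$PSDefaultParameterValues.Remove('Out-File:Encoding')
```

Removing the entry at the end keeps the change from leaking into the rest of the session.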

js2010
  • Indeed, but note that in Windows PowerShell this [only works in the latest - and final - version, v5.1](https://stackoverflow.com/a/40098904/45375). Also worth mentioning that (BOM-less) UTF-8 is PowerShell Core's (v6+) default encoding across _all_ cmdlets. – mklement0 Oct 30 '19 at 21:34

Just a little example using streams, although I realize this wasn't the original question.

C:\temp\ConfirmWrapper.ps1 -Force -Verbose 4>&1 6>&1 | Out-File -Encoding default -FilePath C:\temp\confirmLog.txt -Append

This redirects the information (6) and verbose (4) streams to the output (1) stream and pipes everything to Out-File with ANSI (default) encoding.

Alexis Coles