
We have a domain-wide automation tool that can start jobs on servers as admin (Stonebranch UAC - please note: this has nothing to do with Windows "User Account Control"; Stonebranch UAC is an enterprise automation tool). Natively, it looks for cmd batch scripts, so we use those.

However, I prefer to use PowerShell for everything, so I bulk-created dozens of .bat scripts using PowerShell. Nothing worked: the automation tool broke whenever it tried to run the .bat scripts. I pared the scripts back until they contained just the single line `echo 123`, and still everything was broken. We thought it was a problem with the tool, but then we tried running the .bat scripts directly on the server and they were broken there too, just generating a couple of Unicode-looking characters on the command line and failing to run.

So it dawned on us that something about how PowerShell redirects Write-Output to create the batch scripts was breaking them (this is on Windows Server 2012 R2 with PowerShell 5.1). I can reproduce it; for example, if I type the following in a PowerShell console:

Write-Output "echo 123" > test.bat

If I then open cmd.exe and try to run test.bat, I just get a splat of two Unicode-looking characters on the screen and nothing else.

Can someone explain a) why this behaviour happens, and b) how I can continue to use PowerShell to generate these batch scripts without them being broken? i.e. do I have to change BOM or UTF-8 settings or something to get this working, and if so, how do I do that, please?

YorSubs
  • not a PowerShell expert, but as far as I know, PowerShell writes Unicode by default. There is a switch to write ANSI/ASCII, but I don't remember it... – Stephan Nov 17 '20 at 16:21
  • as Stephan pointed out, you need to deal with encoding for your file. DO NOT use the redirection stuff ... instead use one of the "to file" commands that allows you to explicitly set the encoding. look at what you get from a search for `powershell set file encoding` ... [*grin*] – Lee_Dailey Nov 17 '20 at 16:23
  • So you think that `Set-Content` with the encoding flag is the way (I can never get my head around all of the Unicode and UTF8 and BOMs and all that stuff!)? hmm, or anyone have an example of what is the *best* way to create batch files from PowerShell (or even, how might I take a file and use some special cmdlet that will *transmogrify* that file into a non-Unicode file?) https://learn.microsoft.com/en-us/powershell/module/microsoft.powershell.management/set-content?view=powershell-7.1 – YorSubs Nov 17 '20 at 16:27

2 Answers


In Windows PowerShell, >, like the underlying Out-File cmdlet, invariably[1] creates "Unicode" (UTF-16LE) files, which cmd.exe cannot read (not even with the /U switch).
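To make the failure concrete, here is a small illustrative Python sketch (not part of the original answer; cp1252 is assumed as the console's ANSI code page, the actual page depends on the system locale) showing why the file begins with two junk characters:

```python
# Sketch: reproduce the bytes Windows PowerShell's ">" writes, and show how
# the UTF-16LE byte-order mark renders as two stray characters when read
# back as single-byte text (cp1252 assumed here).
import codecs

content = "echo 123\r\n"
file_bytes = codecs.BOM_UTF16_LE + content.encode("utf-16-le")

print(file_bytes[:2])                    # b'\xff\xfe' -- the UTF-16LE BOM
print(file_bytes[:2].decode("cp1252"))   # the two junk characters the OP saw
print(len(file_bytes))                   # 2-byte BOM + 2 bytes per ASCII char
```

cmd.exe reads the file byte by byte, so the BOM comes out as garbage and every second byte of the actual command text is a NUL, which is why the script never runs.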

In PowerShell [Core] v6+, BOM-less UTF-8 encoding is consistently used, including by >.

Therefore:

  • If you're using PowerShell [Core] v6+ AND the content of the batch file comprises ASCII-range characters only (7-bit range), you can get away with >.

  • Otherwise, use Set-Content with -Encoding Oem.

'@echo 123' | Set-Content -Encoding Oem test.bat

If your batch-file source code only ever contains ASCII-range characters (7-bit range), you can also get away with (in both PowerShell editions):

'@echo 123' | Set-Content test.bat

Note:

  • As the -Encoding argument implies, the system's active OEM code page is used, which is what batch files expect.

  • OEM code pages are supersets of ASCII encoding, so a file saved with -Encoding Oem that is composed only of ASCII-range characters is implicitly also an ASCII file. The same applies to BOM-less UTF-8 and ANSI (Default) encoded files composed of ASCII-range characters only.

-Encoding Oem - as opposed to -Encoding Ascii or even using Set-Content's default encoding[2] - therefore only matters if you have non-ASCII-range characters in your batch file's source code, such as é. Such characters, however, are limited to a set of 256 characters in total, given that OEM code pages are fixed-width single-byte encodings, which means that many Unicode characters are inherently unusable.
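The relationship between these encodings can be illustrated with a short Python sketch (illustrative only; cp850 is assumed as the OEM code page and cp1252 as the ANSI code page - the actual pages depend on the system locale):

```python
# For ASCII-range text, OEM (cp850 assumed), ANSI (cp1252 assumed) and
# BOM-less UTF-8 all produce identical bytes, which is why any of them
# yields a batch file cmd.exe can run.
text = "@echo 123\r\n"
assert (text.encode("cp850") == text.encode("cp1252")
        == text.encode("utf-8") == text.encode("ascii"))

# For a non-ASCII character such as "é", the encodings diverge:
print("é".encode("cp850"))    # b'\x82'     (OEM code page 850)
print("é".encode("cp1252"))   # b'\xe9'     (ANSI code page 1252)
print("é".encode("utf-8"))    # b'\xc3\xa9' (two bytes in UTF-8)
```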


[1] In Windows PowerShell v5.1 (and above), it is possible to change >'s encoding via the $PSDefaultParameterValues preference variable - see this answer - however, you won't be able to select a BOM-less UTF-8 encoding, which would be needed for creation of batch files (composed of ASCII-range characters only).

[2] Set-Content's default encoding is the active ANSI code page (Default) in Windows PowerShell (another ASCII superset), and (as for all cmdlets) BOM-less UTF-8 in PowerShell [Core] v6+; for an overview of the wildly inconsistent character encodings in Windows PowerShell, see this answer.
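The comments below also ask about converting an existing mis-encoded file. A hypothetical Python helper (not part of the answer; cp850 assumed as the OEM code page) could sketch such a transcoding step, working on raw bytes to avoid newline translation:

```python
# Hypothetical helper: rewrite a UTF-16LE file in place using an OEM code
# page (cp850 assumed here).
import codecs
import os
import tempfile
from pathlib import Path

def transcode_to_oem(path, oem_codepage="cp850"):
    p = Path(path)
    text = p.read_bytes().decode("utf-16")   # the BOM selects the byte order
    p.write_bytes(text.encode(oem_codepage))

# Demo on a throwaway file written the way Windows PowerShell's ">" writes it:
fd, tmp = tempfile.mkstemp(suffix=".bat")
os.close(fd)
Path(tmp).write_bytes(codecs.BOM_UTF16_LE + "@echo 123\r\n".encode("utf-16-le"))
transcode_to_oem(tmp)
print(Path(tmp).read_bytes())   # b'@echo 123\r\n' -- plain single-byte text now
os.remove(tmp)
```

Note this fails for characters that have no mapping in the target OEM code page; that limitation is inherent to single-byte encodings, as discussed above.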

mklement0
  • Does there happen to be a cmdlet that can look at a file and return only the encoding information? (I thought maybe `Get-Content -Encoding`, but that generates an error.) Also, is there a cmdlet that might take a file, read it, then delete that file and replace it with a file with the same content but a different encoding? – YorSubs Nov 18 '20 at 07:07
  • @YorSubs: There is nothing built in, but the [PowerShellCookbook module](https://www.powershellgallery.com/packages/PowerShellCookbook) has a `Get-FileEncoding` cmdlet that adds such a command directly to PowerShell. Similarly, there is no built-in transcoding cmdlet, but you can use .NET to achieve that; see [this answer](https://stackoverflow.com/a/53283685/45375), for example. – mklement0 Nov 19 '20 at 19:36
  • Note that there's a [longstanding feature request](https://github.com/PowerShell/PowerShell/issues/2290) to bring something like `Get-FileEncoding` directly to PowerShell, and in a related [RFC discussion](https://github.com/PowerShell/PowerShell-RFC/issues/67) a transcoding cmdlet (`Convert-FileEncoding`) was briefly discussed, but no one stepped up to pursue this further. – mklement0 Nov 19 '20 at 19:37
  • Thanks for all of the information, really interesting. Real shame they did not include a `Get-FileEncoding` cmdlet; it's purely informational so it can't break anything and would be really useful (i.e. tests like `if ($(Get-FileEncoding file) -eq "OEM") { do a thing }`). Linux has always had things like `dos2unix`, so again, it's a shame that they won't implement something like `Convert-FileEncoding`, as encoding is actually a nightmare to deal with since it is invisible and inscrutable (it just bites us when we least expect it and causes confusion because the file *looks* OK otherwise). – YorSubs Nov 20 '20 at 11:01
  • Yes, it would be helpful. Note that `dos2unix` only converts _line endings_, not character encoding, but Unix-like platforms have the `file` utility, which also reports encoding, and the `iconv` utility for transcoding. If you have WSL installed, you could at least use them from there (you can target the Windows file-system). – mklement0 Nov 20 '20 at 14:21

A method that by default creates BOM-less UTF-8 files is:

    New-Item -Path outfile.bat -ItemType File -Value "echo this"
T3RR0R
  • Indeed (yet another inconsistency in Windows PowerShell), but this only works if the batch-file content is limited to ASCII-range characters (7-bit code points), and in that case you can simply use `'echo this' | Set-Content outfile.bat`. If characters in the 8-bit range are present, `'echo this' | Set-Content -Encoding oem outfile.bat` must be used. – mklement0 Nov 17 '20 at 17:07