tl;dr:
Use Out-File -Encoding oem to produce files that cmd.exe reads correctly.
This effectively limits you to the 256 characters available in the legacy "ANSI" / OEM code pages, except NUL (0x0). See bottom section if you need full Unicode support.
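To get a feel for what the OEM limitation means in practice, here is a minimal Python sketch. It assumes that code page 437 is the active OEM code page (yours may differ; chcp reports it), and the helper name fits_oem and the sample strings are made up for illustration.

```python
# Assumption: the active OEM code page is 437 (US-English).
OEM = "cp437"

def fits_oem(text: str) -> bool:
    """Return True if every character of text can be encoded in the OEM code page."""
    try:
        text.encode(OEM)
        return True
    except UnicodeEncodeError:
        return False

print(fits_oem("café"))  # é is part of code page 437
print(fits_oem("5 €"))   # the euro sign is not
```

Text that fails this kind of check cannot be represented faithfully in a file written with the OEM encoding, which is when the UTF-8 approach becomes relevant.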
In Windows PowerShell (but not PowerShell Core), Out-File and its effective alias > default to UTF-16LE character encoding, where most characters are represented as 2-byte sequences; for characters in the ASCII range, the 2nd byte of each sequence is NUL (0x0); additionally, such files start with a BOM that indicates the type of encoding.
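The byte layout described above can be inspected with a short Python sketch. This only reproduces the encoding, not PowerShell itself, and the two-character sample is simply the start of the string from the question.

```python
import codecs

text = "1i"  # first two characters of the sample string

# UTF-16LE preceded by its BOM, the layout described above
data = codecs.BOM_UTF16_LE + text.encode("utf-16-le")

print(data.hex(" "))  # ff fe 31 00 69 00
```

The first two bytes are the BOM; each ASCII-range character that follows is its ASCII code followed by a NUL byte.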
By contrast, cmd.exe expects input to use the legacy single-byte OEM encoding (note that starting cmd.exe with /U only controls the encoding of its output).
When cmd.exe (unbeknownst to it) encounters UTF-16LE input:
- It interprets the bytes individually as characters (even though characters in UTF-16LE are composed of 2 bytes (typically), or, in rare cases, of 4 (a pair of 2-byte sequences)).
- It interprets the 2 bytes that make up the BOM (0xff, 0xfe) as part of the string. With OEM code page 437 (US-English) in effect, 0xff renders like a space, whereas 0xfe renders as ■.
- Reading stops once the first NUL (0x0 byte) is encountered, which happens with the 1st character from the ASCII range, which in your sample string is 1.
Therefore, string 1i32l54bl5b2hlthtl098 encoded as UTF-16LE is read as ■1, as you state.
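The three steps can be simulated in Python. This is only a sketch of the behavior described here, not of how cmd.exe is actually implemented; the function name read_like_cmd is hypothetical, and code page 437 is assumed to be in effect.

```python
import codecs

sample = "1i32l54bl5b2hlthtl098"
utf16_bytes = codecs.BOM_UTF16_LE + sample.encode("utf-16-le")

def read_like_cmd(data: bytes, oem: str = "cp437") -> str:
    """Decode byte by byte with the OEM code page, stopping at the first NUL."""
    end = data.find(b"\x00")
    if end == -1:
        end = len(data)
    return data[:end].decode(oem)

result = read_like_cmd(utf16_bytes)
print(repr(result))  # '\xa0■1'
```

In code page 437, byte 0xff maps to a no-break space (shown as \xa0 by repr), which is why only ■1 is visibly left of the original string.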
If you need full Unicode support, use UTF-8 encoding:
- Use Out-File -Encoding utf8 in PowerShell.
- Before reading the file in cmd.exe (in a batch file), run chcp 65001 in order to switch to the UTF-8 code page.
Caveats:
- Not all Unicode characters may render correctly, depending on the font used in the console window.
- Legacy applications may malfunction with code page 65001 in effect, especially on older Windows versions.
  - A possible strategy to avoid problems is to temporarily switch to code page 65001, as needed, and then switch back.
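Why the code page must match the file's encoding can be illustrated with a small Python sketch. The sample text is made up, and decoding with "utf-8" versus "cp437" merely stands in for reading the file with code page 65001 versus an assumed OEM code page 437 in effect.

```python
text = "Grüße, 世界"  # hypothetical sample with non-OEM characters
utf8_bytes = text.encode("utf-8")

# UTF-8 text contains no NUL bytes, so reading does not stop early
# the way it does with UTF-16LE.
print(b"\x00" in utf8_bytes)               # False

# Interpreted as UTF-8 (what chcp 65001 selects), the text round-trips.
print(utf8_bytes.decode("utf-8") == text)  # True

# Interpreted with OEM code page 437 instead, each byte of a multi-byte
# sequence turns into a separate, unrelated character.
print(utf8_bytes.decode("cp437") == text)  # False
```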
Note that the above only covers communication via files, and only in one direction (PowerShell -> cmd.exe).
To also control the character encoding used for the standard streams (stdin, stdout, stderr), both when sending strings to cmd.exe / external programs and when interpreting strings received from them, see this answer of mine.