that PowerShell ISE seems to encode string constants in ANSI.
That only applies when communicating with external programs, whereas you're using in-process .NET APIs.
As an aside: this discrepancy with regular console windows, which use the active OEM code page, is one of the reasons the obsolescent ISE is problematic - see the bottom section of this answer for more information.
String literals in memory are always .NET strings, which are UTF-16-encoded (composed of 16-bit Unicode code units), capable of representing all Unicode characters.[1]
To send UTF-8 strings, specify charset=utf-8 as part of the -ContentType argument; e.g.:
Invoke-RestMethod -ContentType 'text/plain; charset=utf-8' ...
On receiving strings, PowerShell automatically decodes them based either on an explicitly specified charset field (character encoding) in the response's Content-Type header or, in its absence, using ISO-8859-1 (which is closely related to, but in effect a subset of, Windows-1252).
- If a given response doesn't specify a charset but actually uses an encoding other than ISO-8859-1 - say UTF-8 - PowerShell will misinterpret the strings received, which requires re-encoding after the fact - see this answer.
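The after-the-fact re-encoding can be sketched as follows. This is a minimal illustration, assuming the response really was UTF-8 but was decoded as ISO-8859-1; the mis-decoded string is simulated locally rather than fetched from a real service, and all variable names are made up for the example.

```powershell
# Simulate a UTF-8 response that was mis-decoded as ISO-8859-1.
$latin1 = [System.Text.Encoding]::GetEncoding('ISO-8859-1')
$utf8   = [System.Text.Encoding]::UTF8

# 'café', built via [char] so the script doesn't depend on its own file encoding.
$original   = "caf$([char]0xE9)"
# What PowerShell would hand you: each UTF-8 byte decoded as one ISO-8859-1 character.
$misdecoded = $latin1.GetString($utf8.GetBytes($original))

# Repair: recover the original bytes via ISO-8859-1, then decode them as UTF-8.
$repaired = $utf8.GetString($latin1.GetBytes($misdecoded))
$repaired -eq $original
```

This works because ISO-8859-1 maps every byte value to exactly one character, so converting the mis-decoded string back to bytes is lossless.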
Character encoding when communicating with external programs:
If you need to send a string with a particular encoding to an external program (via the pipeline, which the target program receives via stdin), set the $OutputEncoding preference variable to that encoding, and PowerShell will automatically convert your .NET strings to the specified encoding.
To send UTF-8-encoded strings to external programs via the pipeline:
$OutputEncoding = [System.Text.UTF8Encoding]::new()
Note, however, that this alone isn't sufficient to correctly receive UTF-8 output from external programs; for that, you need to set [Console]::OutputEncoding to the same encoding.
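To get a feel for what the chosen encoding means on the sending side, the following sketch encodes the same .NET string in-process with two encoding objects and prints the resulting bytes. It doesn't call an external program; the ASCII encoding is used purely for contrast, and the variable names are illustrative.

```powershell
# 'café', built via [char] so the script doesn't depend on its own file encoding.
$text = "caf$([char]0xE9)"

# Encode the same string with two different encodings.
$asciiBytes = [System.Text.ASCIIEncoding]::new().GetBytes($text)
$utf8Bytes  = [System.Text.UTF8Encoding]::new().GetBytes($text)

# Show the bytes as hex: ASCII can't represent the accented character, UTF-8 can.
$asciiHex = ($asciiBytes | ForEach-Object { $_.ToString('X2') }) -join ' '
$utf8Hex  = ($utf8Bytes  | ForEach-Object { $_.ToString('X2') }) -join ' '
$asciiHex   # 63 61 66 3F  (the accented character is replaced with '?')
$utf8Hex    # 63 61 66 C3 A9
```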
To make your PowerShell session fully UTF-8-aware (irrespective of whether in the ISE or a regular console window):
# Needed in the ISE only:
chcp >$null # Dummy console-program call that ensures that a console is allocated.
# Set all encodings relevant to communicating with external programs to UTF-8.
$OutputEncoding = [Console]::InputEncoding = [Console]::OutputEncoding =
  [System.Text.UTF8Encoding]::new()
See this answer for more information.
[1] Note, however, that Unicode characters with a code point greater than 0xFFFF, i.e. those outside the so-called BMP (Basic Multilingual Plane), must be represented with two 16-bit code units ([char]), namely so-called surrogate pairs.
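A quick way to observe this is to build a string from a code point outside the BMP and inspect it. The sketch below uses U+1F600 as an arbitrary example; the variable names are illustrative.

```powershell
# U+1F600 lies outside the BMP, so .NET stores it as a surrogate pair.
$s = [char]::ConvertFromUtf32(0x1F600)

$codeUnits = $s.Length                                      # count of UTF-16 code units ([char] instances)
$isPair    = [char]::IsSurrogatePair($s, 0)                 # do the first two code units form a surrogate pair?
$utf8Count = [System.Text.Encoding]::UTF8.GetByteCount($s)  # bytes needed when encoded as UTF-8

$codeUnits   # 2
$isPair      # True
$utf8Count   # 4
```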