
The PowerShell console inside VS Code reports a different [Console]::InputEncoding than a PowerShell console outside of VS Code. What are the differences, and what pitfalls might this cause?

PS C:\Users\lit> $PSVersionTable.PSVersion.ToString()
7.2.1

PS C:\Users\lit> $Host

Name             : Visual Studio Code Host
Version          : 2021.11.1
InstanceId       : fb65bead-b049-4ac8-befd-54df023771fd
UI               : System.Management.Automation.Internal.Host.InternalHostUserInterface
CurrentCulture   : en-US
CurrentUICulture : en-US
PrivateData      : 
DebuggerEnabled  : True
IsRunspacePushed : False
Runspace         : System.Management.Automation.Runspaces.LocalRunspace

PS C:\Users\lit> [console]::InputEncoding

Preamble          : 
BodyName          : utf-8
EncodingName      : Unicode (UTF-8)
HeaderName        : utf-8
WebName           : utf-8
WindowsCodePage   : 1200
IsBrowserDisplay  : True
IsBrowserSave     : True
IsMailNewsDisplay : True
IsMailNewsSave    : True
IsSingleByte      : False
EncoderFallback   : System.Text.EncoderReplacementFallback
DecoderFallback   : System.Text.DecoderReplacementFallback
IsReadOnly        : False
CodePage          : 65001

However, from a command console outside of VS Code:

PS C:\src\t> $PSVersionTable.PSVersion.ToString()
7.2.1

PS C:\src\t> $host

Name             : ConsoleHost
Version          : 7.2.1
InstanceId       : 41168572-6825-4b84-897a-77432d0a20a3
UI               : System.Management.Automation.Internal.Host.InternalHostUserInterface
CurrentCulture   : en-US
CurrentUICulture : en-US
PrivateData      : Microsoft.PowerShell.ConsoleHost+ConsoleColorProxy
DebuggerEnabled  : True
IsRunspacePushed : False
Runspace         : System.Management.Automation.Runspaces.LocalRunspace

PS C:\src\t> [console]::InputEncoding

IsSingleByte      : True
EncodingName      : OEM United States
WebName           : ibm437
HeaderName        : ibm437
BodyName          : ibm437
Preamble          :
WindowsCodePage   :
IsBrowserDisplay  :
IsBrowserSave     :
IsMailNewsDisplay :
IsMailNewsSave    :
EncoderFallback   : System.Text.InternalEncoderBestFitFallback
DecoderFallback   : System.Text.InternalDecoderBestFitFallback
IsReadOnly        : True
CodePage          : 437
lit
    I think this might give you a hint https://stackoverflow.com/a/57134096/15339544. On Windows OEM and ANSI seem to be the default Encoding as opposed to Linux in my case (UTF-8). VSCode probably does something like: `$OutputEncoding = [Console]::InputEncoding = [Console]::OutputEncoding = [System.Text.Encoding]::UTF8` each time it starts. – Santiago Squarzon Dec 24 '21 at 19:44
    VS Code has a separate `$Profile`, so make sure you set `utf8` in both. -- See `$profile | select *` to see the locations for the current terminal. -- By that I mean the `Integrated Terminal`. It doesn't run off the same assembly as the external system. – ninMonkey Dec 24 '21 at 22:08

1 Answer


As of VSCode 1.63 and the PowerShell extension v2021.12.0, the behavior is as follows:

  • VSCode's integrated terminal behaves like regular console windows with respect to character encoding.

    • This means that the system's active OEM code page (e.g., 437 on US-English systems) is in effect by default, as configured in the system locale (language for non-Unicode programs), and as reflected in the output from chcp.com.

    • Note that recent versions of Windows 10 have a still-in-beta feature that allows you to set both the OEM and the ANSI code pages to 65001, i.e. to switch to UTF-8 system-wide, although that has far-reaching consequences - see this answer.

  • The PowerShell extension for VSCode - which comes with its own customized PowerShell terminal called the PowerShell Integrated Console - uses its own settings:

    • It sets [Console]::OutputEncoding to UTF-8 (code page 65001), so that output from external programs is decoded as UTF-8 text. This is arguably the better choice for modern CLIs, such as node.exe, but - as with the default code page - you may have to situationally change [Console]::OutputEncoding (temporarily) to match a given program's (nonstandard) character encoding.

    • Curiously, it doesn't also set [Console]::InputEncoding to UTF-8, though in practice that setting only matters if you pipe input to PowerShell from the outside. (However, a side effect of not changing this setting is that chcp.com still reports the OEM code page, even though the output code page has in effect been changed to 65001 via [Console]::OutputEncoding).
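
The temporary switch of [Console]::OutputEncoding mentioned above can be sketched roughly as follows. This is only an illustration: the function name Invoke-WithConsoleOutputEncoding and the program legacy-tool.exe are hypothetical, and the sketch assumes a session that has a real console attached, since assigning [Console]::OutputEncoding can fail without one.

```powershell
# Inspect the settings discussed above (the values depend on your host and profile).
[Console]::InputEncoding.CodePage    # what chcp.com reports
[Console]::OutputEncoding.CodePage   # how output from external programs is decoded
$OutputEncoding.CodePage             # how data piped to external programs is encoded

# Hypothetical helper: run a script block with a different console output
# encoding, then restore the previous one even if the call throws.
function Invoke-WithConsoleOutputEncoding {
    param(
        [Parameter(Mandatory)] [System.Text.Encoding] $Encoding,
        [Parameter(Mandatory)] [scriptblock] $ScriptBlock
    )
    $previous = [Console]::OutputEncoding
    try {
        [Console]::OutputEncoding = $Encoding
        & $ScriptBlock
    }
    finally {
        [Console]::OutputEncoding = $previous
    }
}

# Example call (legacy-tool.exe is a placeholder for a program that emits OEM 437 text):
# $lines = Invoke-WithConsoleOutputEncoding -Encoding ([System.Text.Encoding]::GetEncoding(437)) -ScriptBlock { legacy-tool.exe }
```

Restoring the previous value in a finally block keeps the rest of the session unaffected if the external program fails.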


Your findings imply that your integrated terminal settings have been customized, such as with:

$OutputEncoding = [Console]::InputEncoding = [Console]::OutputEncoding = [System.Text.UTF8Encoding]::new()

as Santiago points out, which is how you can make a PowerShell console fully UTF-8-aware.
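
As for the pitfalls you asked about: the main one is text being encoded under one code page and decoded under another. A small sketch that needs no external program, assuming code page 437 is available via [System.Text.Encoding]::GetEncoding() (which should be the case in both Windows PowerShell and PowerShell 7):

```powershell
# One non-ASCII character: U+00E9 (e with acute accent), built from its code
# point so that the script file's own encoding does not matter.
$char  = [string][char]0xE9

# UTF-8 encodes it as two bytes: 0xC3 0xA9.
$bytes = [System.Text.Encoding]::UTF8.GetBytes($char)

# Decoding with the matching encoding round-trips to one character.
$asUtf8 = [System.Text.Encoding]::UTF8.GetString($bytes)

# Decoding the same bytes as OEM 437 yields two unrelated characters,
# because 437 is a single-byte encoding.
$asOem = [System.Text.Encoding]::GetEncoding(437).GetString($bytes)

"UTF-8 decode: $asUtf8 (length $($asUtf8.Length))"
"CP437 decode: $asOem (length $($asOem.Length))"
```

The same effect occurs whenever an external program emits UTF-8 while [Console]::OutputEncoding is still the OEM code page, or vice versa.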

As for where this customization may be applied: Probably in your VSCode-specific profile file, as reflected in $PROFILE, as ninMonkey points out.
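
One way to look for such a customization is to search every existing profile file for the word "Encoding". This is a rough sketch, not a guaranteed detection: the pattern is only a heuristic, and the setting could also come from elsewhere (for example, a dot-sourced script).

```powershell
# The four profile locations exposed as properties of $PROFILE
# (compare with the output of: $PROFILE | Select-Object *).
$names = 'AllUsersAllHosts', 'AllUsersCurrentHost',
         'CurrentUserAllHosts', 'CurrentUserCurrentHost'

# Keep only the profile files that actually exist on disk.
$paths = @(
    $names |
        ForEach-Object { $PROFILE.$_ } |
        Where-Object { $_ -and (Test-Path -LiteralPath $_) }
)

# Report every line that mentions an encoding setting.
if ($paths.Count -gt 0) {
    Select-String -LiteralPath $paths -Pattern 'Encoding'
}
```

Run this once in the VSCode terminal and once in a regular console window, because the CurrentHost profile paths differ between the two.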

mklement0