The following was all run in PowerShell 3.0, both in the standard console host and in the PowerShell ISE, using a font that contains the tested Unicode codepoint.
The following C# program correctly prints ≈ (U+2248), so we know it can work:
static void Main(string[] args)
{
    Console.WriteLine("\u2248");
}
As a side note, Console.OutputEncoding claims to be codepage IBM850, which certainly can't be the whole story: independent of what I set the console codepage to (using chcp), the output is still correct. So .NET must handle the encoding itself (or call some special APIs?).
Now when I try the following Java program, I end up with garbled output instead of the expected glyph:
public static void main(String[] args) {
    System.out.println("\u2248");
}
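The failure can be reproduced without a console at all by encoding the character by hand. This is a minimal sketch (class name is mine), assuming a Western Windows default of windows-1252:

```java
import java.util.Arrays;

public class EncodingDemo {
    public static void main(String[] args) throws Exception {
        // windows-1252 has no mapping for U+2248, so the encoder
        // falls back to the replacement byte '?' (0x3F).
        System.out.println(Arrays.toString("\u2248".getBytes("windows-1252"))); // [63]

        // UTF-8 encodes U+2248 as the three bytes E2 89 88.
        System.out.println(Arrays.toString("\u2248".getBytes("UTF-8"))); // [-30, -119, -120]
    }
}
```

The literal '?' that comes out of the windows-1252 encoder is what ends up on the screen, regardless of the console's codepage.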
That is expected: Java encodes output with the platform default encoding, which here is windows-1252, a charset that cannot represent U+2248. But the following also doesn't work:
public static void main(String[] args) throws UnsupportedEncodingException {
    new PrintStream(System.out, true, "UTF-16").println("\u2248");
}
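My reading of why this fails (an assumption, not verified against the console internals): the console host decodes the byte stream using the active codepage, while Java's "UTF-16" charset emits a byte-order mark and spreads each character over two bytes, so the raw bytes can never line up with a single-byte codepage:

```java
import java.util.Arrays;

public class Utf16Bytes {
    public static void main(String[] args) throws Exception {
        // Java's "UTF-16" charset writes a big-endian BOM (FE FF)
        // followed by U+2248 as 22 48: four bytes that a codepage-850
        // console reinterprets as four unrelated characters.
        System.out.println(Arrays.toString("\u2248".getBytes("UTF-16"))); // [-2, -1, 34, 72]
    }
}
```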
What does work is to use UTF-8 and call chcp 65001 beforehand. That shows the right glyph, but it hits a bug where some characters are repeated at the end of the line: printing \u2248weird. results in ≈weird.d., so this is not great either.
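For reference, the UTF-8 variant described above looks like this; a sketch assuming chcp 65001 has already been run in the console (class name is mine):

```java
import java.io.PrintStream;

public class Utf8Console {
    public static void main(String[] args) throws Exception {
        // Requires the console codepage to be UTF-8 already: chcp 65001.
        // Wrapping System.out pins the encoding instead of relying on
        // the platform default (windows-1252 here).
        PrintStream out = new PrintStream(System.out, true, "UTF-8");
        out.println("\u2248");
    }
}
```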
So what encoding is C# using to write to the console, or more generally how can I get Java to correctly output Unicode in PowerShell?