I have a couple of cases where I want to print Unicode characters in a Windows console, programming in both C# and Visual Basic. Chinese is turning out to be a problem.
I've looked for answers on MSDN and on StackOverflow (here and here), and I've read Joel on Software, too. But I think I first read that article back in 2003, and it isn't all that relevant to my problem today.
Here's some example code; it's practically the same as what is proposed by Microsoft but with a slight modification to display Chinese and Greek.
using System;
using System.Text;
class Example
{
    static void Main()
    {
        string unicodeString = "This string contains some Chinese (福更斯)";
        string unicodeString2 = "This string contains some Greek (Αλφάβητο)";

        // Create two different encodings.
        Encoding ascii = Encoding.ASCII;
        Encoding unicode = Encoding.Unicode;

        // Convert the strings into byte arrays.
        byte[] unicodeBytes = unicode.GetBytes(unicodeString);
        byte[] unicodeBytes2 = unicode.GetBytes(unicodeString2);

        // Perform the conversion from one encoding to the other.
        byte[] asciiBytes = Encoding.Convert(unicode, ascii, unicodeBytes);
        byte[] asciiBytes2 = Encoding.Convert(unicode, ascii, unicodeBytes2);

        // Convert the new byte[] into a char[] and then into a string.
        char[] asciiChars = new char[ascii.GetCharCount(asciiBytes, 0, asciiBytes.Length)];
        ascii.GetChars(asciiBytes, 0, asciiBytes.Length, asciiChars, 0);
        string asciiString = new string(asciiChars);
        char[] asciiChars2 = new char[ascii.GetCharCount(asciiBytes2, 0, asciiBytes2.Length)];
        ascii.GetChars(asciiBytes2, 0, asciiBytes2.Length, asciiChars2, 0);
        string asciiString2 = new string(asciiChars2);

        // Display the strings created before and after the conversion.
        Console.WriteLine("Original string: {0}", unicodeString);
        Console.WriteLine("Ascii converted string: {0}", asciiString);
        Console.WriteLine("Original string: {0}", unicodeString2);
        Console.WriteLine("Ascii converted string: {0}", asciiString2);
    }
}
// The example displays the following output:
// Original string: This string contains some Chinese (福更斯)
// Ascii converted string: This string contains some Chinese (???)
// Original string: This string contains some Greek (Αλφάβητο)
// Ascii converted string: This string contains some Greek (????????)
(adapted from https://learn.microsoft.com/en-us/dotnet/api/system.text.encoding?view=netframework-4.7)
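As far as I understand it, the question marks in the "Ascii converted" lines are expected: ASCII has no mapping for Chinese or Greek, so the encoder substitutes "?" for each character. A minimal sketch of that, where the class name EncodingRoundTrip and the helper RoundTrip are my own hypothetical names and not part of the Microsoft sample:

```csharp
using System;
using System.Text;

static class EncodingRoundTrip
{
    // Hypothetical helper: encode the text to bytes with the given
    // encoding, then decode those bytes back into a string.
    public static string RoundTrip(string text, Encoding encoding)
    {
        return encoding.GetString(encoding.GetBytes(text));
    }

    static void Main()
    {
        string sample = "福更斯 Αλφάβητο";

        // ASCII cannot represent these characters, so each one is
        // replaced by '?' during encoding and the original is lost.
        Console.WriteLine(RoundTrip(sample, Encoding.ASCII));

        // UTF-8 can represent every character here, so the string
        // survives the round trip unchanged.
        Console.WriteLine(RoundTrip(sample, Encoding.UTF8));
    }
}
```

So the lossy lines are not the problem I'm asking about; it's the "Original string" line for Chinese that I care about.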
I compile this in Visual Studio 2017, targeting .NET 4.5.2. I start a console with cmd /K chcp 65001. When I run the .exe, this is what I see:
Original string: This string contains some Chinese (☒☒☒)
Ascii converted string: This string contains some Chinese (???)
Original string: This string contains some Greek (Αλφάβητο)
Ascii converted string: This string contains some Greek (????????)
Here, I'm using ☒ to represent what I see in the console, which is a question mark in a box. If I copy that text from the console and paste it elsewhere, I get the Chinese characters.
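To check that the string itself is intact regardless of the font, something like the following could be used. This is only a sketch: CodePointDump and CodePoints are hypothetical names of mine, and I'm assuming that setting Console.OutputEncoding to UTF-8 from code has the same effect as the chcp 65001 step. It changes the code page, not the console font.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

static class CodePointDump
{
    // Hypothetical helper: list the Unicode code points of a string
    // as "U+XXXX" values, so they can be read in any console font.
    public static string CodePoints(string text)
    {
        var parts = new List<string>();
        for (int i = 0; i < text.Length; i++)
        {
            int codePoint = char.ConvertToUtf32(text, i);
            // A surrogate pair occupies two chars; skip the second half.
            if (char.IsHighSurrogate(text[i]))
            {
                i++;
            }
            parts.Add("U+" + codePoint.ToString("X4"));
        }
        return string.Join(" ", parts);
    }

    static void Main()
    {
        // Assumption: equivalent to running chcp 65001 beforehand.
        Console.OutputEncoding = Encoding.UTF8;

        string chinese = "福更斯";
        Console.WriteLine(chinese);
        Console.WriteLine(CodePoints(chinese));
    }
}
```

If the code points come out as expected while the characters still show as boxes, that would point at the console font and not at the encoding.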
The console uses Consolas, and in Notepad this font displays the Chinese correctly. In the console, however, I only see the Chinese correctly if I switch the font to MS Gothic or NSimSun, and then the Greek looks horrible (the characters are spaced further apart than they should be).
Is there a simple and reliable way of getting the console to display Chinese?
When the code is finished, it will be distributed as an example to third parties, so I'll also need to be able to explain a simple solution for displaying the text correctly. It should work in almost 100% of cases (Windows 10, Visual Studio 2017, .NET 4.5 and later) without needing to download and install extra components (extra fonts, for instance).