
I am writing Unicode text to the console. I set the console output code page as described on this page, and switched stdout to UTF-8 mode with _setmode as described in this answer. This is my code:

SetConsoleOutputCP(CP_UTF8);
_setmode(_fileno(stdout), _O_U8TEXT);   
wstring str = L"Testing unicode -- English -- Ελληνικά -- 中文 -- Español -- けものフレンズ -- abc1234.";
wcout << str << endl;

This code prints correctly on my Windows 10 (Traditional Chinese) machine and on a Windows 7 (Japanese) machine, but it still cannot show the Unicode text on a Windows 10 (English) machine.

I checked the console code page, which is already 65001 (UTF-8), and I tried different fonts, but nothing changed. How can I display the text correctly?

UPDATE:

I tried writing the non-ASCII characters as explicit Unicode escape sequences, like this:

wstring example = L"Testing unicode -- English -- \u0395\u03bb\u03bb\u03b7\u03bd\u03b9\u03ba\u03ac \u002d\u002d \u4e2d\u6587 \u002d\u002d \u0045\u0073\u0070\u0061\u00f1\u006f\u006c \u002d\u002d \u3051\u3082\u306e\u30d5\u30ec\u30f3\u30ba -- abc1234.";
wcout << example << endl;

With this change it prints correctly (but only if I choose the "MS Gothic" or "MingLiU" font). Why is the first string not displayed correctly in the English environment?
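One way to check whether the source file's encoding is the culprit (the explanation zett42 offers in the comments) is to compare a literal written with real characters against the same text written with \u escapes. The sketch below is only a diagnostic under that assumption; the function name `literal_matches_escapes` is hypothetical and not part of the original code. If the compiler misreads the file's encoding, the two strings should differ.

```cpp
#include <string>

// Hypothetical diagnostic helper (not from the original post).
// Returns true when the compiler decoded the raw characters in this source
// file to the same UTF-16 code units as the explicit \u escapes, which
// suggests the file's encoding was interpreted correctly.
bool literal_matches_escapes() {
    // Raw characters: their meaning depends on how the compiler reads the file.
    const std::u16string literal = u"中文 -- けもの";
    // Explicit escapes: independent of the source file's encoding.
    const std::u16string escaped =
        u"\u4e2d\u6587 -- \u3051\u3082\u306e";
    return literal == escaped;
}
```

If this returns false on the English machine's build, saving the file as UTF-8 with a signature (BOM), or compiling with MSVC's `/utf-8` option, would be the things to try first.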

Dia
  • Is there anything interesting in here: https://github.com/nodejs/node-v0.x-archive/issues/7940 ? Maybe double-check that the fonts you tried actually have Unicode support. There is one listed in that link that supposedly does. – xaxxon Sep 07 '17 at 03:12
  • `wchar_t` stores UTF-16 in Windows. Writing UTF-16 to a console in UTF-8 mode has undefined behavior. – IInspectable Sep 07 '17 at 06:48
  • You linked to the right question but used the wrong answer. The [correct answer](https://stackoverflow.com/a/9051543/7571258) calls `_setmode(_fileno(stdout), _O_U16TEXT);`. Also make sure to select a font for the console that has the Unicode code points available ("MS Gothic" works on my machine). – zett42 Sep 07 '17 at 07:14
  • I tried `_O_U16TEXT`, and tried the "MS Gothic" and "DejaVu Sans Mono" fonts. Still no luck. – Dia Sep 07 '17 at 07:23
  • I think you must convert UTF-16 to UTF-8 using `WideCharToMultiByte`. – Li Kui Sep 07 '17 at 07:59
  • As the "\uNNNN" encoding works for you, you probably didn't save the source code file with a **Unicode encoding**. In versions of Visual Studio prior to 2017, select "File > Advanced Save Options" and choose "Unicode (UTF-8 with signature)". In VS2017, use File > Save As, then click the down arrow on the Save button and click "Save With Encoding...". – zett42 Sep 07 '17 at 09:33
  • The legacy console code page is unrelated to using the wide-character API with the C stream in `_O_U16TEXT` mode. Don't change it to 65001. The console's behavior with 65001 is buggy across Windows versions, and setting this code page will affect other programs that run attached to the current console. The problems all seem to be solved in the latest Windows 10, but not prior to the Creators Update and definitely not in older versions of Windows. – Eryk Sun Sep 07 '17 at 19:07
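
For reference, the UTF-16 to UTF-8 conversion that Li Kui's comment refers to can be sketched in portable C++ as below. This is a hand-written stand-in for illustration, not `WideCharToMultiByte` itself. The function name `utf16_to_utf8`, the use of `std::u16string` instead of `std::wstring`, and the choice to replace unpaired surrogates with U+FFFD are all assumptions of this sketch.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Illustrative UTF-16 -> UTF-8 conversion (hypothetical helper).
// Unpaired surrogates are replaced with U+FFFD; that policy is an
// assumption of this sketch, not something stated in the thread.
std::string utf16_to_utf8(const std::u16string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        std::uint32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            // High surrogate: combine with the following low surrogate.
            if (i + 1 < in.size() && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
                ++i;
            } else {
                cp = 0xFFFD;
            }
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            cp = 0xFFFD;  // stray low surrogate
        }
        // Encode the code point as 1 to 4 UTF-8 bytes.
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

A call such as `utf16_to_utf8(u"\u4e2d\u6587")` yields the six bytes E4 B8 AD E6 96 87. Whether writing those bytes to a console set to code page 65001 displays correctly still depends on the Windows version, as Eryk Sun's comment points out, so the wide-character route with `_O_U16TEXT` may remain the safer choice.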
