
Here's what I'm getting now from wprintf:

1胩?鳧?1敬爄汯?瑳瑡獵猆慴畴??

Is UTF-8 just not supported by Windows?

Alan
  • possible duplicate of [utfcpp and Win32 wide API](http://stackoverflow.com/questions/3329718/utfcpp-and-win32-wide-api) – Hans Passant Oct 05 '10 at 13:25

2 Answers


No, Windows doesn't support printing UTF-8 to the console.

When Windows says "Unicode", it means UTF-16. You need to use MultiByteToWideChar to convert from UTF-8 to UTF-16. Something like this:

const char *text = "My UTF-8 text\n";
// First call: pass a NULL buffer to get the required size in wchar_t units
// (including the terminating NUL, because the input length is -1).
int len = MultiByteToWideChar(CP_UTF8, 0, text, -1, NULL, 0);
wchar_t *unicode_text = new wchar_t[len];
// Second call: perform the actual UTF-8 to UTF-16 conversion.
MultiByteToWideChar(CP_UTF8, 0, text, -1, unicode_text, len);
wprintf(L"%s", unicode_text);
delete[] unicode_text;
RichieHindle
  • @Alan: No, they are both methods of encoding Unicode text into bytes. UTF-16 uses at least 16 bits per character, whereas UTF-8 will use 8 bits per character for most Western characters. UTF-8 is popular on the web and on operating systems other than Windows, while Windows uses UTF-16. They do the same job, in different ways. See http://en.wikipedia.org/wiki/UTF-8 and http://en.wikipedia.org/wiki/UTF-16/UCS-2 for full details. – RichieHindle Oct 06 '10 at 08:23

wprintf is supposed to receive a UTF-16 encoded string. Use the following for conversion:

Use MultiByteToWideChar with the CP_UTF8 codepage to do the conversion (and don't blindly cast a char* to wchar_t*).

valdo
  • It is supposed to receive UTF-16 only on Windows (which is actually against the standard, because UTF-16 is a variable-length encoding). Most other platforms expect UTF-32. – Šimon Tóth Oct 05 '10 at 11:17