
I am trying to read Chinese characters from a UTF-8 encoded text file and store them in variables. When I print them to the console, question marks appear in place of the characters.

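    // Test the read itself rather than eof(), so a failed get() is never processed.
    // Note: c is a char, so each iteration reads one byte, not one whole character.
    while (fin.get(c))
    {
        appendCharacterToWord(currentWord, c);
    }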

(I am working on Windows, and the code is in C++.)
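For context on the loop above: in UTF-8, a Chinese character usually occupies three bytes (sometimes four), so reading with `get(c)` into a `char` yields one byte at a time. The following is only a minimal sketch of how the bytes could be regrouped into whole characters, assuming well-formed input. The helper names `utf8SequenceLength`, `readUtf8Characters`, and `splitUtf8` are hypothetical and not part of the original code, and the sketch does not validate continuation bytes or address the console display problem.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: number of bytes in the UTF-8 sequence that starts
// with `lead`. Returns 1 for ASCII and for invalid lead bytes, so the
// caller always makes progress.
inline std::size_t utf8SequenceLength(unsigned char lead)
{
    if ((lead & 0x80) == 0x00) return 1; // 0xxxxxxx: ASCII
    if ((lead & 0xE0) == 0xC0) return 2; // 110xxxxx
    if ((lead & 0xF0) == 0xE0) return 3; // 1110xxxx: most Chinese characters
    if ((lead & 0xF8) == 0xF0) return 4; // 11110xxx: outside the BMP
    return 1;
}

// Hypothetical helper: reads the stream to the end and returns one
// std::string per UTF-8 character (its raw bytes, still UTF-8 encoded).
// Continuation bytes are not validated; a truncated final sequence is
// returned as-is.
inline std::vector<std::string> readUtf8Characters(std::istream& in)
{
    std::vector<std::string> characters;
    char c;
    while (in.get(c)) {
        std::string sequence(1, c);
        std::size_t remaining =
            utf8SequenceLength(static_cast<unsigned char>(c)) - 1;
        while (remaining > 0 && in.get(c)) {
            sequence += c;
            --remaining;
        }
        characters.push_back(sequence);
    }
    return characters;
}

// Convenience overload for data that is already in memory.
inline std::vector<std::string> splitUtf8(const std::string& text)
{
    std::istringstream in(text);
    return readUtf8Characters(in);
}
```

With a file, the stream would be opened in binary mode, e.g. `std::ifstream fin("input.txt", std::ios::binary);` (the file name is a placeholder), and passed to `readUtf8Characters(fin)`. Each returned string still holds UTF-8 bytes, so whether it displays correctly depends on the console issues discussed in the comments.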

Kijewski
SvckG
  • The Windows console can't handle UTF-8 (or any Unicode) properly. Although some things will work with tricks like cp65001, it's just buggy down to the core (including Microsoft's C library implementation). – deviantfan Jul 02 '15 at 03:27
  • (See e.g. https://social.msdn.microsoft.com/Forums/vstudio/en-US/e4b91f49-6f60-4ffe-887a-e18e39250905/possible-bugs-in-writefile-and-crt-unicode-issues?forum=vcgeneral for the kinds of problems that happen with CP65001. This was reported in 2010, and still... the bug report is marked as "not solvable".) – deviantfan Jul 02 '15 at 03:37
  • Windows consoles were based on UCS2, effectively UTF-16 limited to the Basic Multilingual Plane. Not all Chinese characters are in the BMP, and so cannot be displayed by default *even when you do things correctly*. It also depends on the font. That said, in May I finally had enough of these questions about displaying characters in Windows, and wrote up [a FAQ-like item about how to do it](http://stackoverflow.com/questions/30197758/how-can-i-make-unicode-iostream-i-o-work-in-both-windows-and-unix-land). Due to the Chinese aspect this question isn't an exact duplicate, but nearly. – Cheers and hth. - Alf Jul 02 '15 at 03:37
  • Thanks. I will check the links. – SvckG Jul 02 '15 at 06:11

0 Answers