
I am trying to print a UTF-8 string to the Windows console. The console's code page is set to 65001 (UTF-8), the font is Lucida Console, and the C++ source file is encoded as UTF-8 without a BOM. Consider the following code:

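#include <clocale>
#include <cstdio>    // needed for printf
#include <iostream>
#include <locale>

int main(int narg, char** arg) {
    using namespace std;
    // Report the active C++ and C locales.
    cout << "C++ locale: " << cout.getloc().name()
         << "\nC locale: " << setlocale(LC_ALL, 0) << "\n";
    cout << "中文\n";    // written through the C++ stream
    printf("中文\n");    // written through C stdio
    return 0;
}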

The output is:

C++ locale: C
C locale: C
������
中文

Could anybody explain this and suggest a solution that makes the C++ and C output paths both print the correct text? Thanks very much.

System: Windows 7 (32-bit)

Compiler: Visual Studio 2012 Express

Edit: The program prints correctly when built with GCC under Ubuntu 12.

cqdjyy01234

1 Answer


By default the console does not display UTF-8 text. However, you can use:

chcp 65001 to switch the console to UTF-8, or change it from code by calling SetConsoleOutputCP
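A minimal sketch of the in-code route, assuming a Windows build where `<windows.h>` is available. The names `enable_utf8_console` and `kSample` are hypothetical and only serve the illustration. On other platforms the helper is a no-op. Note that this only switches the output code page; the asker already had code page 65001 active, so it is not by itself expected to fix the `cout` output shown in the question.

```cpp
#include <cstdio>
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: switch the console output code page to UTF-8.
// Returns true on success. On non-Windows platforms there is nothing
// to switch, so it simply reports success.
bool enable_utf8_console() {
#ifdef _WIN32
    return SetConsoleOutputCP(CP_UTF8) != 0;   // CP_UTF8 is 65001
#else
    return true;
#endif
}

// "中文" spelled out as UTF-8 bytes so the literal does not depend on
// how the compiler interprets the source file encoding.
const char kSample[] = "\xE4\xB8\xAD\xE6\x96\x87";

// Usage (inside main):
//     if (enable_utf8_console()) std::printf("%s\n", kSample);
```

This has the same effect as running chcp 65001 before starting the program, but it applies only while the program is attached to that console.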

Hope this helps :)

Addendum: sorry, I missed that part initially! The only way I can get the � (question mark in a diamond) symbol to appear is on my second machine, which has no international fonts installed. I had to add the Consolas font manually through the registry; however, there are serious problems displaying UTF-8 text in the Windows console. On my Windows 2003 machine I had to do the following: Start -> Control Panel -> Regional and Language Options -> Advanced -> Language for non-Unicode programs -> Chinese

GMasucci
  • I have already changed the console's code page to UTF-8 (read the question). – cqdjyy01234 Apr 11 '13 at 08:56
  • Thank you! I think the problem is due to Visual Studio's implementation of cout, since everything is fine with GCC (Ubuntu). – cqdjyy01234 Apr 12 '13 at 00:28
  • If you redirect the output of your program to a text file and open it in a UTF-8 capable viewer, you should see the right output, e.g. `mytestprintprog.exe > c:\test.txt`. That lets you check whether the program is emitting the correct bytes regardless of what the console displays. It is the Windows console that causes the problems; possibly Microsoft regards it as a legacy tool of sorts, because it lacks the full functionality one might expect when coming from Linux to Windows. – GMasucci Apr 12 '13 at 10:23
  • The other thing you can try is one of the console replacements that support UTF-8 more easily, such as [console 2](http://sourceforge.net/projects/console/). Although this will solve the problem for you, it still leaves the problem with the Microsoft console. I will dig around to see whether I have any decent docs or books on the Microsoft console, and report back when I have something that will help to display UTF-8 characters in the standard console, so that your program displays correctly on an out-of-the-box system. – GMasucci Apr 15 '13 at 09:30
  • [ConEmu](https://code.google.com/p/conemu-maximus5/) is a more actively maintained shell wrapper, which will wrap the cmd console or just about any other console you like to use. – GMasucci Apr 15 '13 at 09:45
  • @user1535111 I hope you've had some success. I was going to give out any and all tips I could think of, but then I found this page [here](http://stackoverflow.com/questions/3780378/how-to-display-japanese-kanji-inside-a-cmd-window-under-windows) and think it may help you. It is written for Kanji characters, but the same principles apply to any language with special characters in the font: just substitute whatever language you wish to use for Kanji/Japanese and it should be fine. Good luck, and let me know how you get on, as I know how infuriating console work can be on Windows! – GMasucci Apr 24 '13 at 08:26
  • I think my problem is different from [that page](http://stackoverflow.com/questions/3780378/how-to-display-japanese-kanji-inside-a-cmd-window-under-windows). In any case, I tested it and it failed. In fact, I can display Chinese with 'printf', so the problem should lie in Visual Studio's implementation of cout. – cqdjyy01234 Apr 24 '13 at 11:28
  • The last thing I can think of is that you are using characters that are either not in the Lucida font or beyond the UTF-8 range. I also think that cout is not capable of displaying certain characters while printf and wprintf are, so perhaps the easiest solution would be to change the cout calls to printf and its variants? – GMasucci Apr 24 '13 at 13:51
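The two ideas from the comments, checking the emitted bytes and working around cout, can be sketched as follows. This rests on an assumption the thread does not confirm: one commonly suggested explanation is that the Visual Studio runtime passes cout output to an unbuffered console stdout one byte at a time, so the console never receives a complete UTF-8 sequence, whereas printf hands over the whole string in one write. If that is the cause, giving stdout a buffer with `setvbuf` before any output may let cout behave like printf. The helper names `to_hex` and `buffer_stdout` and the buffer size are hypothetical.

```cpp
#include <cstddef>
#include <cstdio>
#include <iostream>
#include <string>

// Hypothetical helper: render the raw bytes of a string as hex, to
// confirm that a literal really holds UTF-8 (for "中文" the expected
// bytes are E4 B8 AD E6 96 87).
std::string to_hex(const std::string& s) {
    static const char digits[] = "0123456789ABCDEF";
    std::string out;
    for (std::size_t i = 0; i < s.size(); ++i) {
        unsigned char c = static_cast<unsigned char>(s[i]);
        if (i != 0) out += ' ';
        out += digits[c >> 4];
        out += digits[c & 0x0F];
    }
    return out;
}

// Hypothetical helper: make stdout fully buffered so that multi-byte
// sequences reach the console in one write. It must be called before
// anything is written to stdout. The default size is an arbitrary choice.
bool buffer_stdout(std::size_t size = 4096) {
    return std::setvbuf(stdout, nullptr, _IOFBF, size) == 0;
}

// Usage (inside main, before any output):
//     buffer_stdout();
//     std::cout << "\xE4\xB8\xAD\xE6\x96\x87\n";
//     std::cout << to_hex("\xE4\xB8\xAD\xE6\x96\x87") << std::endl;
```

With full buffering, output appears only when the buffer fills, when the stream is flushed (for example by `std::endl`), or at program exit, so interactive prompts would need an explicit flush.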