I was reading the well-known answer about string and wstring and ran into some confusion.
My setup: source charset and execution charset are both set to UTF-8, Windows x64, the VC++ compiler, a Git Bash console (which can print Unicode characters), and system default code page 936 (GB2312).
My experiment code:
#include <cstring>
#include <iostream>
using namespace std;
int main(int argc, char* argv[])
{
    wchar_t c[] = L"olé";
    wchar_t d[] = L"abc";
    wcout << c << endl;
    wcout << d << endl;
    return 0;
}
It prints "abc" but cannot print "é".
I understand that wchar_t is used together with L-prefixed string literals, and that on Windows wchar_t is encoded as UTF-16. (That is hard-coded, right? No matter which source charset or execution charset I choose, L"abc" would always have the same UTF-16 code units.)
The question is: how can wcout print a UTF-16 encoded string ("abc") when my source file is UTF-8 and the execution charset is UTF-8? I would expect the program not to recognize UTF-16 encoded data unless I set everything to UTF-16.
And if it can print UTF-16 in some way, then why can't it print é?