I'm trying to use the C++11 u8, u, and U string literals to encode this emoji:
http://www.fileformat.info/info/unicode/char/1f601/index.htm
Now, I'm using hex escapes with the code-unit values of each encoding to store it:
const char* utf8string = u8"\xF0\x9F\x98\x81";
const char16_t* utf16string = u"\xD83D\xDE01";
const char32_t* utf32string = U"\x0001F601";
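For reference, those escape values are just the code units of U+1F601 in each encoding. Here is a small sketch (plain C++11, separate from the snippet above) that derives them from the code point, in case anyone wants to double-check the escapes:

#include <cstdio>

int main() {
    const unsigned cp = 0x1F601; // U+1F601 GRINNING FACE WITH SMILING EYES

    // UTF-16: code points above U+FFFF are encoded as a surrogate pair.
    const unsigned v    = cp - 0x10000;
    const unsigned high = 0xD800 + (v >> 10);   // expected 0xD83D
    const unsigned low  = 0xDC00 + (v & 0x3FF); // expected 0xDE01
    std::printf("UTF-16: %04X %04X\n", high, low);

    // UTF-8: code points in [U+10000, U+10FFFF] take four bytes.
    std::printf("UTF-8:  %02X %02X %02X %02X\n",
                0xF0 | (cp >> 18),
                0x80 | ((cp >> 12) & 0x3F),
                0x80 | ((cp >> 6)  & 0x3F),
                0x80 | (cp & 0x3F));
}

This prints D83D DE01 and F0 9F 98 81, matching the escapes used above.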
This works fine in GCC 6.2 and Clang 3.8: the strings have lengths of 4, 2, and 1 code units respectively. With the Visual Studio 2015 compiler, however, the lengths are 8, 2, and 1.
I'm using this code to get the length of each string:
#include <iostream>
#include <cstring>

int main() {
    const char* smiley8 = u8"\xF0\x9F\x98\x81";
    const char16_t* smiley16 = u"\xD83D\xDE01";
    const char32_t* smiley32 = U"\x0001F601";

    // Advance each pointer to the terminating null code unit.
    auto smiley8_it = smiley8;
    while ((*++smiley8_it) != 0);
    auto smiley16_it = smiley16;
    while ((*++smiley16_it) != 0);
    auto smiley32_it = smiley32;
    while ((*++smiley32_it) != 0);

    // Length in code units (char, char16_t and char32_t respectively).
    size_t smiley8_size = smiley8_it - smiley8;
    size_t smiley16_size = smiley16_it - smiley16;
    size_t smiley32_size = smiley32_it - smiley32;

    std::cout << smiley8_size << std::endl;
    std::cout << smiley16_size << std::endl;
    std::cout << smiley32_size << std::endl;
}
I also tested the UTF-8 string with std::strlen.
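For completeness, that check looks like this (a minimal sketch using the same smiley8 literal as above); since std::strlen counts char units up to the first null byte, it reports the same number as the pointer loop:

#include <cstring>
#include <iostream>

int main() {
    const char* smiley8 = u8"\xF0\x9F\x98\x81";
    // std::strlen counts char code units (bytes) up to the terminating null,
    // so it matches the loop-based count above.
    std::cout << std::strlen(smiley8) << std::endl;
}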
Any clues why this happens?