
I need a way to convert chars into hex values as strings.

I've tried a few ways, but all of them just ignored UTF-8 characters.

For example:

Take character:

Ş

If it's converted correctly, its hex value is 0x15E, but this code just returns 0x3F, which is the character ?.

wchar_t mychar = 'Ş';
cout << hex << setw(2) << setfill('0') 
                  << static_cast<unsigned int>(mychar);

I've found a JavaScript function (linked here) that does exactly what I need, but I couldn't convert it to C++.

Thanks

Arefi Clayton

2 Answers

1

The problem is that you are assigning a char literal to wchar_t mychar. Because a char is only one byte long, it cannot store the character Ş, which is why you end up with ? (0x3F) instead. You have to prefix the literal with L, like this:

wchar_t mychar = L'Ş';
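To get the hex value as a string rather than printing it, something along these lines could work. This is only a minimal sketch: `to_hex` is a made-up helper name, not something from the question, and it assumes a C++11 compiler.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper (not from the original post): formats the numeric
// value of a wide character as a hex string such as "0x15e".
std::string to_hex(wchar_t c) {
    std::ostringstream out;
    // showbase adds the "0x" prefix; the cast makes the stream print a
    // number instead of trying to print a character.
    out << std::showbase << std::hex << static_cast<unsigned long>(c);
    return out.str();
}
```

Calling `to_hex(L'\u015E')` should give the string `0x15e`. Writing the character as the universal character name `\u015E` instead of typing Ş directly avoids depending on how the compiler decodes the source file.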

A very good article about Unicode, encodings, etc. is The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!) by Joel Spolsky.

mrbraitwistle
  • Thank you! Indeed a great article. Got it figured out now, thanks again. – Arefi Clayton Jan 02 '16 at 15:30
  • Note that `mychar` will be encoded as UTF16 or UTF32, depending on compiler and platform. In UTF8, Unicode codepoint U+015E is `0xC5 0x9E`. There are many ways of converting a `wchar_t` to UTF8, either at compile time or runtime, depending on compiler and libraries used. – Remy Lebeau Jan 02 '16 at 15:48
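If the UTF-8 bytes mentioned in the comment above are what you are after, one way to get them at runtime is to encode the code point by hand. The sketch below is an illustration only: `utf8_hex` is a hypothetical helper, and it does not validate its input.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: encodes one Unicode code point as UTF-8 and returns
// the bytes as a space-separated hex string. Surrogates and values above
// U+10FFFF are not rejected, to keep the sketch short.
std::string utf8_hex(char32_t cp) {
    unsigned char bytes[4];
    int n = 0;
    if (cp < 0x80) {                       // 1 byte: 0xxxxxxx
        bytes[n++] = static_cast<unsigned char>(cp);
    } else if (cp < 0x800) {               // 2 bytes: 110xxxxx 10xxxxxx
        bytes[n++] = static_cast<unsigned char>(0xC0 | (cp >> 6));
        bytes[n++] = static_cast<unsigned char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {             // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        bytes[n++] = static_cast<unsigned char>(0xE0 | (cp >> 12));
        bytes[n++] = static_cast<unsigned char>(0x80 | ((cp >> 6) & 0x3F));
        bytes[n++] = static_cast<unsigned char>(0x80 | (cp & 0x3F));
    } else {                               // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        bytes[n++] = static_cast<unsigned char>(0xF0 | (cp >> 18));
        bytes[n++] = static_cast<unsigned char>(0x80 | ((cp >> 12) & 0x3F));
        bytes[n++] = static_cast<unsigned char>(0x80 | ((cp >> 6) & 0x3F));
        bytes[n++] = static_cast<unsigned char>(0x80 | (cp & 0x3F));
    }

    std::ostringstream out;
    out << std::hex << std::uppercase << std::setfill('0');
    for (int i = 0; i < n; ++i) {
        if (i > 0) out << ' ';
        // setw only applies to the next item, so it is repeated per byte.
        out << "0x" << std::setw(2) << static_cast<unsigned int>(bytes[i]);
    }
    return out.str();
}
```

For U+015E this gives `0xC5 0x9E`, matching the bytes quoted in the comment.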
0

Even if you prefix the literal with L, wchar_t is not a great fit for international character sets, because its size and encoding differ between platforms.

Try this:

char16_t mychar16 {u'Ş'}; // Initialized with the UTF-16 code unit
char32_t mychar32 {U'Ş'}; // Initialized with the UTF-32 code point

// internal + setfill('0') puts the padding between "0x" and the digits,
// so the zeros do not end up after the value
cout << showbase << hex << internal << setw(6) << setfill('0')
     << static_cast<unsigned int>(mychar16) << endl;

Result:

0x015e

The character encoding used for wchar_t is implementation-defined, so it can vary from one compiler to another. The types char16_t and char32_t are better for handling Unicode characters.
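Since char32_t holds a whole code point per element, the same idea extends to entire strings. Here is a rough sketch, assuming C++11 and using a made-up helper name `code_points_hex`:

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: lists the code point of every character in a
// UTF-32 string as space-separated hex values, e.g. "0x41 0x15e".
std::string code_points_hex(const std::u32string& text) {
    std::ostringstream out;
    out << std::showbase << std::hex;
    bool first = true;
    for (char32_t cp : text) {
        if (!first) out << ' ';
        // Cast to an integer type so the value is printed as a number.
        out << static_cast<unsigned long>(cp);
        first = false;
    }
    return out.str();
}
```

For example, `code_points_hex(U"A\u015E")` should produce `0x41 0x15e`.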

Marko Tunjic