4

What I'm trying to do is convert a string's bytes into hexadecimal format.
Based on this answer (and many other consistent ones) I've tried the code:

#include <string>
#include <sstream>
#include <iomanip>
#include <iostream>

int main ()
{
   std::string inputText = u8"A7°";

   std::stringstream ss;
   // print every char of the string as hex on 2 values
   for (unsigned int i = 0; i < inputText.size (); ++i)
   {
      ss << std::hex << std::setfill ('0') << std::setw (2) << (int) inputText[i];
   }

   std::cout << ss.str() << std::endl;
}

but with some characters encoded in UTF-8 it doesn't work.
For instance, with strings containing the degree symbol ( ° ) encoded in UTF-8, the result is ffffffc2ffffffb0 instead of c2b0.
I would expect the algorithm to work on individual bytes regardless of their contents, and furthermore the result seems to ignore the setw(2) parameter.
Why do I get such a result?


rickyviking
  • 846
  • 6
  • 17
  • 1
    Post the smallest complete program you can write that compiles, runs, and shows the problem. It should only need about six lines of code. Note that negative values generally have all their high bits set to 1, which in hex shows up as a string of `f`s. – Pete Becker Jan 13 '17 at 11:36

1 Answer

8

As Pete Becker already hinted in a comment, converting a negative value to a wider integer type fills the high-order bits with 1 (sign extension). On platforms where char is signed, bytes like 0xC2 are negative, so casting them directly to int yields values like 0xFFFFFFC2. The solution is to first cast the char to unsigned char before casting it to int:

#include <string>
#include <iostream>
#include <iomanip>

int main()
{
    std::string inputText = "-12°C";
    // print every char of the string as hex on 2 values
    for (unsigned int i = 0; i < inputText.size(); ++i)
    {
       std::cout << std::hex << std::setfill('0')  
                 << std::setw(2) << (int)(unsigned char)inputText[i];
    }
}

setw sets the minimum field width; it does not truncate longer values.

alain
  • 11,939
  • 2
  • 31
  • 51