
EDITED: I'm writing a C++ function that takes a hex string and converts it to a Unicode string. The conversion I wrote using a string stream does not give the desired output. Could you please suggest a solution?

If I hard-code the Unicode escapes (for example, "e820" as "\u00e8\u0020"), it works, but how can I do this at run time?

Example 1:

  • input hex string: 6880d7a5876b2b9b
  • Expected output: h×¥k+
  • current output with result: hÇ╫Ñçk+¢
  • current output with Temp : h×¥k+

Example 2:

  • input hex string: e820b19b7ae506ff
  • Expected output: è ±zåÿ
  • current output with result: Φ ▒¢zσ♠
  • current output with Temp : è ±zåÿ

My code sample:

//Hex string that I have
std::string str( "6880d7a5876b2b9b" );
std::string Temp( "\u00e8\u0020\u00b1\u009b\u007a\u00e5\u0006\u00ff" );
std::string result;
//Take 2 hex digits (one byte) and convert them to a char
for (int i = 0; i < (int)str.size(); i += 2)
{
    std::istringstream iss(str.substr(i, 2));
    int temp;
    iss >> std::hex >> temp;
    result += static_cast<char>(temp);
}
std::cout << result << std::endl;
std::cout << Temp << std::endl;

After a week's struggle, I found a solution using the Boost library. The fix is below:

  #include <iostream>
  #include <sstream>
  #include <string>
  #include <boost/locale/encoding_utf.hpp>

  std::string HexString("6880d7a5876b2b9b");      //Hex string as input

  std::string UnicodeString;
  //Take one byte at a time, i.e. 2 hex digits of the string
  for (int i = 0; i < ( int )HexString.size(); i += 2)
  {
    //Convert the two hex digits to an integer code point
    std::istringstream iss( HexString.substr( i, 2 ) );
    unsigned int temp;
    iss >> std::hex >> temp;
    //Encode the code point (a single UTF-32 unit) as UTF-8
    std::string Utf8Char = boost::locale::conv::utf_to_utf<char>( &temp, (&temp + 1) );
    UnicodeString.append( Utf8Char );
  }
  std::cout<<"UnicodeString:"<<UnicodeString<<std::endl;
  • The conversion is OK and produces the correct string. You are displaying it incorrectly: you have to convert it from "ISO-8859-1" to the target encoding that the output is set to (the encoding of `std::cout` in your case). – Ped7g Jul 28 '16 at 14:45
  • I think the duplicate mark is wrong; strictly speaking, that question answers something different. Still, there is probably enough related material there to put Parameshwar on the right track about what to study. But the answer to this question is "no problem, your code works, you just don't understand what you did and expect a wrong (different) thing". ... Oh, now I see he did check that answer, and it didn't get him on track. – Ped7g Jul 29 '16 at 09:46
  • Parameshwar: so, what didn't you understand from my comment? Your solution works. You have a correct ISO-8859-1 string in `result`. You just don't know how to display it. To verify this, save it to a text file, open the file in an editor, and switch the editor's view encoding to ISO-8859-1; the string will be fine. So you have to specify what your real problem is. – Ped7g Jul 29 '16 at 09:50
  • If you want to display ISO-8859-1 text correctly, you need a display device capable of ISO-8859-1 (and you have to switch it to that encoding), or you have to know the target encoding of the display and convert from ISO-8859-1 to it (the conversion may be incomplete if the target encoding doesn't support all characters from ISO-8859-1). The `std::cout` display device can be pretty much anything; for me, for example, it would be a Linux console with UTF-8 encoding, so I would do an ISO-8859-1 to UTF-8 conversion before `std::cout` (and I would use a library for that, like iconv). – Ped7g Jul 29 '16 at 09:52
  • Judging from the "current output", you are probably running it on some obsolete old OS, like MS-DOS or MS Windows. So look up its console capabilities and how to select its encoding, but those OSes are quite limited. – Ped7g Jul 29 '16 at 09:54
  • Thanks, @Ped7g, for all the answers you have given. I figured out the solution using the Boost library. – Parameshwar Aug 09 '16 at 07:35
