
I found these informative threads:

C# solution for encoding/decoding a Unicode string:

How do you convert Byte Array to Hexadecimal String, and vice versa?

JavaScript solution for encoding/decoding a Unicode string:

Javascript: Unicode string to hex

But the two solutions order the hex characters differently.

JavaScript example (code copied 1:1 from the link above):

var str = "그러하지";
hex = str.hexEncode(); // returns "adf8b7ecd558c9c0"

C# example (I tried two solutions, both give the same result):

/// <summary>
/// Convert a string to hex value
/// </summary>
/// <param name="stringValue"></param>
/// <returns></returns>
public string HexEncode(string stringValue)
{
  var ba = Encoding.Unicode.GetBytes(stringValue);
  // SOLUTION 1
  //var c = new char[ba.Length * 2];
  //for (var i = 0; i < ba.Length; i++)
  //{
  //  var b = ba[i] >> 4;
  //  c[i * 2] = (char)(55 + b + (((b - 10) >> 31) & -7));
  //  b = ba[i] & 0xF;
  //  c[i * 2 + 1] = (char)(55 + b + (((b - 10) >> 31) & -7));
  //}
  //return new string(c);
  // SOLUTION 2
  var hex = new StringBuilder(ba.Length * 2);
  foreach (var b in ba)
    hex.AppendFormat("{0:x2}", b);
  return hex.ToString();
}

/// <summary>
/// Converts a hex value to a string
/// </summary>
/// <param name="hexString"></param>
/// <returns></returns>
public string HexDecode(string hexString)
{
  if (hexString == null || (hexString.Length & 1) == 1) return "";
  // SOLUTION 1
  //hexString = hexString.ToUpper();
  //var hexStringLength = hexString.Length;
  //var b = new byte[hexStringLength / 2];
  //for (var i = 0; i < hexStringLength; i += 2)
  //{
  //  var topChar = (hexString[i] > 0x40 ? hexString[i] - 0x37 : hexString[i] - 0x30) << 4;
  //  var bottomChar = hexString[i + 1] > 0x40 ? hexString[i + 1] - 0x37 : hexString[i + 1] - 0x30;
  //  b[i / 2] = Convert.ToByte(topChar + bottomChar);
  //}
  // SOLUTION 2
  var numberChars = hexString.Length;
  var bytes = new byte[numberChars / 2];
  for (var i = 0; i < numberChars; i += 2)
    bytes[i / 2] = Convert.ToByte(hexString.Substring(i, 2), 16);
  return Encoding.Unicode.GetString(bytes);
}


var hex = tools.HexEncode("그러하지"); // "f8adecb758d5c0c9"
var str = tools.HexDecode(hex);       // "그러하지" again, so the round trip works
  • JS: adf8 b7ec d558 c9c0
  • C#: f8ad ecb7 58d5 c0c9

So the byte order is swapped. Both encoding and decoding work as long as I stay in the same environment, but I need to encode in JS and decode in C#, and vice versa.
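
For reference, a minimal C# check makes the observed byte order concrete (BitConverter is used here only to print the bytes):

var bytes = Encoding.Unicode.GetBytes("그러하지"); // the bytes HexEncode works on
Console.WriteLine(BitConverter.ToString(bytes));  // prints F8-AD-EC-B7-58-D5-C0-C9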

I do not know which of the two is correct, if "correct" can even be defined here. And how do I fix this?


1 Answer


Both values are correct. Your JavaScript solution gives you the Unicode bytes in big-endian order, while C# gives them in little-endian order (MSDN article, see the Remarks section). To make the C# byte array match your JavaScript output, define your encoding like this:

UnicodeEncoding bigEndianUnicode = new UnicodeEncoding(true, true);

And later use it like this:

var ba = bigEndianUnicode.GetBytes(stringValue);

Demo: .Net Fiddle
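
For completeness, here is a minimal sketch of the question's two methods using the big-endian encoding so that the C# output matches the JavaScript value. The HexTools class name is illustrative; Encoding.BigEndianUnicode could be used instead of new UnicodeEncoding(true, true), as it is the same encoding.

using System;
using System.Text;

public static class HexTools
{
  static readonly Encoding BigEndian = new UnicodeEncoding(true, true);

  public static string HexEncode(string stringValue)
  {
    // Big-endian UTF-16 bytes, matching the JavaScript output
    var ba = BigEndian.GetBytes(stringValue);
    var hex = new StringBuilder(ba.Length * 2);
    foreach (var b in ba)
      hex.AppendFormat("{0:x2}", b);
    return hex.ToString();
  }

  public static string HexDecode(string hexString)
  {
    if (hexString == null || (hexString.Length & 1) == 1) return "";
    var bytes = new byte[hexString.Length / 2];
    for (var i = 0; i < hexString.Length; i += 2)
      bytes[i / 2] = Convert.ToByte(hexString.Substring(i, 2), 16);
    // Decode with the same byte order that was used for encoding
    return BigEndian.GetString(bytes);
  }
}

// Usage: both directions now round-trip against the JavaScript value.
// HexTools.HexEncode("그러하지")          returns "adf8b7ecd558c9c0"
// HexTools.HexDecode("adf8b7ecd558c9c0") returns "그러하지"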
