I'm working on converting an encryption library written in PHP over to C#, and have a small problem. When converting a hex string to a string in PHP, I get a different value than my C# code produces, even though it is supposed to be doing exactly the same thing.

Here is the PHP code I'm using:

public function hex2str($hex)
{
    $str = '';
    for($i=0; $i<strlen($hex); $i+=2)
    {
        $str.=chr(hexdec(substr($hex, $i, 2)));
    }
    return $str;
}

And my C# code:

public static string Hex2Str(string hexString)
{
    char[] mychar = new char[hexString.Length / 2];
    for (var i = 0; i < mychar.Length; i++)
    {
        // Convert the number expressed in base-16 to an integer. 
        int value = Convert.ToInt32(hexString.Substring(i * 2, 2), 16);
        string stringValue = Char.ConvertFromUtf32(value);
        mychar[i] = (char)value;
    }

    return new String(mychar);
}

The hex value I'm using is:

E0D644FCDEB4CCA04D51F617D59084D8

And here is a picture of the difference between the return values of the PHP script and my C# code:

[image: screenshot comparing the PHP and C# output]

If anyone can help me spot my mistake in the C# code, I would greatly appreciate your help!

Wilson212
  • duplicate: http://stackoverflow.com/questions/14674834/php-convert-string-to-hex-and-hex-to-string – Kamil Karkus Feb 22 '15 at 17:14
  • The duplicate post you linked is all about php code and converting back and forth between hex and string... my question is strictly related to C#, and trying to mimic my PHP code in a one way conversion to a string. – Wilson212 Feb 22 '15 at 17:20
  • my bad, I'm tired, sorry – Kamil Karkus Feb 22 '15 at 17:21
  • How do you get the PHP string into C# to print it? 'α' has a hex value of `3B1`, so it can't possibly be represented by two hex digits (a single byte). `E0` has a value of 224, however, which happens to be the "OEM United States" encoding of 'α' -- maybe you're decoding the PHP string wrongly? – dbc Feb 22 '15 at 17:57
  • I basically run PHP in a new process and capture the output like so: http://pastebin.com/d1qxinvj – Wilson212 Feb 22 '15 at 18:00

2 Answers

1

The difference comes from the code pages involved. It seems that the PHP output was shown using an OEM character set (code page 850), which the default command prompt still uses.

You can try this:

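// Requires: using System; using System.Text;
// On .NET Core / .NET 5+, call Encoding.RegisterProvider(CodePagesEncodingProvider.Instance)
// once before requesting a legacy code page such as 850.
public static string Hex2Str(string hexString)
{
    byte[] myBytes = new byte[hexString.Length / 2];
    for (var i = 0; i < myBytes.Length; i++)
    {
        // Parse each pair of hex digits as one raw byte.
        myBytes[i] = Convert.ToByte(hexString.Substring(i * 2, 2), 16);
    }
    // Decode the raw bytes with the OEM code page (850 here).
    return Encoding.GetEncoding(850).GetString(myBytes);
}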

Be aware that the encoding actually used on your machine depends on its locale settings.

You can also pass a different code page, or use one of the standard encodings:

return Encoding.Default.GetString(myBytes);

This one will probably reproduce the result of your original attempt.

Also note that writing to a file from PHP will presumably give you a different result than printing to stdout on the command line.
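To make the code page dependence concrete, here is a minimal sketch that decodes one raw byte under a chosen code page. The class name `CodePageDemo` and the method `DecodeByte` are hypothetical names made up for this illustration, and the sketch assumes .NET Core 3.0 or later, where the legacy code pages have to be registered before use.

```csharp
using System;
using System.Text;

public static class CodePageDemo
{
    static CodePageDemo()
    {
        // Assumption: running on .NET Core / .NET 5+, where OEM and ANSI
        // code pages are only available after registering this provider.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
    }

    // Hypothetical helper: decodes a single raw byte with the given code page.
    public static string DecodeByte(byte value, int codePage)
    {
        return Encoding.GetEncoding(codePage).GetString(new[] { value });
    }
}
```

With this helper, the first byte of the hex value from the question, `0xE0`, decodes to 'α' under code page 437, to 'Ó' under 850, and to 'à' under 1252, so the same bytes can print as three different strings depending on which console or default encoding is in effect.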

Rainer Schaack
  • I really appreciate your answer, though it's just not quite it :( . Here was the output using your Hex2Str: http://puu.sh/g8apE/ec07b685a5.png – Wilson212 Feb 22 '15 at 19:06
  • I used code page 437 instead of your 850 and it worked! Thank you so much! – Wilson212 Feb 22 '15 at 19:10
1

I think your C# algorithm Hex2Str looks good, though I might suggest the following small change to avoid any possible inconsistencies with surrogate pair encoding:

    public static string Hex2Str(string hexString)
    {
        var sb = new StringBuilder();

        var len = hexString.Length / 2;
        for (var i = 0; i < len; i++)
        {
            // Convert the number expressed in base-16 to an integer. 
            int value = Convert.ToInt32(hexString.Substring(i * 2, 2), 16);
            string stringValue = Char.ConvertFromUtf32(value);
            sb.Append(stringValue);
        }

        return sb.ToString();
    }

The real problem here, I suspect, is that the string from PHP is being mangled when passed through the console due to inconsistent encodings. For instance, if the PHP console has Latin 9 (ISO) encoding and your input console has OEM United States encoding (which it is on my computer) then 'à' will be transformed to 'α'.

Instead, I recommend taking the additional step of encoding your PHP string in Base64 using base64_encode before writing it to the console. This will guarantee a pure ASCII representation as it is passed through the console. Then decode as follows:

    public static string FromPHPBase64String(string phpString)
    {
        var bytes = Convert.FromBase64String(phpString);
        var sb = new StringBuilder();
        foreach (var b in bytes)
        {
            string stringValue = char.ConvertFromUtf32(b);
            sb.Append(stringValue);
        }
        return sb.ToString();
    }

I believe everything should now match.
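As a rough check of why the Base64 detour helps, the sketch below simulates the PHP side in C#. It assumes that `Convert.ToBase64String` produces the same standard, padded Base64 text that PHP's base64_encode would print for the same bytes. The class `Base64RoundTrip` and its methods are hypothetical names used only for this illustration.

```csharp
using System;
using System.Linq;

public static class Base64RoundTrip
{
    // Parses pairs of hex digits into raw bytes; no text encoding is involved.
    public static byte[] HexToBytes(string hex)
    {
        var bytes = new byte[hex.Length / 2];
        for (var i = 0; i < bytes.Length; i++)
        {
            bytes[i] = Convert.ToByte(hex.Substring(i * 2, 2), 16);
        }
        return bytes;
    }

    // Stand-in (assumption) for what PHP's base64_encode would print.
    public static string SimulatePhpOutput(string hex)
    {
        return Convert.ToBase64String(HexToBytes(hex));
    }

    // True when the Base64 text is pure ASCII and decodes back to the same bytes.
    public static bool SurvivesConsole(string hex)
    {
        var base64 = SimulatePhpOutput(hex);
        var isAscii = base64.All(c => c < 128);
        return isAscii && Convert.FromBase64String(base64).SequenceEqual(HexToBytes(hex));
    }
}
```

Because every character of the Base64 text is below 128, the common console code pages (437, 850, 1252) all render it identically, which is why the bytes arrive intact regardless of which one sits between the two processes.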

live2
dbc
  • By converting to a Base64 string, the default encoding from php (ASCII) was preserved normally, and now my strings match... Great suggestion – Wilson212 Feb 22 '15 at 21:22