Hashed string in C# is not readable

Question

I'm trying to hash the same string in C# and in Java.

C# hash method:

    public static string hashValue(string value)
    {
        byte[] input = null;

        HashAlgorithm digest = HashAlgorithm.Create("SHA-512");
        input = digest.ComputeHash(Encoding.UTF8.GetBytes(value));

        return System.Text.UTF8Encoding.UTF8.GetString(input);
    }

The output, in a WPF TextBox, for this is looking like: "՘"�?N[��"��2��D��j��t!z}7�H�p�J��GƼOp�EnBfHڄ�X���" .

The same function, in Java, is returning the result: "[B@41e2db20".

The Java hash method looks like this:

    public static String hashValue(String value) {

        byte[] input = null;

        MessageDigest digest;
        try {
            digest = MessageDigest.getInstance("SHA-512");
            try {
                input = digest.digest(value.getBytes("UTF-8"));

            } catch (UnsupportedEncodingException e) {
                e.printStackTrace();
            }
        } catch (NoSuchAlgorithmException e1) {
            e1.printStackTrace();
        }

        return input.toString();
    }

Can you please let me know what I'm doing wrong? Why does the result look so weird in C#?

To me the Java version looks wrong. `SHA-512` should return 512 bits => 64 bytes => `[B@41e2db20` is too short (see http://en.wikipedia.org/wiki/SHA-2#Examples_of_SHA-2_variants) — Christoph Fink, Jun 18 '14 at 14:15
You shouldn't display a hash as UTF-8. It is just a byte array, not a string. See http://stackoverflow.com/a/5340599/706456 for an example. — oleksii, Jun 18 '14 at 14:18
Don't treat hashes as strings. If you want to display one, hex-encode it. — Mark Peters, Jun 18 '14 at 14:19
I see now what the problem is. Thank you very much for opening my eyes :) — sepo, Jun 18 '14 at 14:25

Duncan Jones · Accepted Answer · 2014-06-18T14:25:18.203

Your C# result is looking "weird" because you've converted the random bytes of a hash into a UTF-8 string. That isn't going to result in anything pretty-looking, since many of the byte values will map to unprintable characters.

You may wish to convert the hash to hexadecimal instead. In Java, you can use the DatatypeConverter class for that:

    return DatatypeConverter.printHexBinary(input);

I'm not sure of the C# equivalent, but a quick search should turn one up.
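As a rough sketch of the hex approach: DatatypeConverter lives in javax.xml.bind, which was removed from the JDK in Java 11, so the example below avoids it and hex-encodes by hand. The class name HexHash and the helper toHex are made up for illustration, and the output is lowercase, whereas printHexBinary produces uppercase.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical class name, used only for this illustration.
final class HexHash {

    // Hypothetical helper: two lowercase hex digits per byte.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            // Mask to 0..255 so negative bytes do not sign-extend.
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    // Hashes the UTF-8 bytes of value with SHA-512 and returns 128 hex characters.
    static String hashValue(String value) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-512");
            byte[] hash = digest.digest(value.getBytes(StandardCharsets.UTF_8));
            return toHex(hash);
        } catch (NoSuchAlgorithmException e) {
            // Assumption: the runtime ships SHA-512, as standard JDKs do.
            throw new IllegalStateException("SHA-512 not available", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(hashValue("abc"));
    }
}
```

On the C# side, `BitConverter.ToString(input).Replace("-", "").ToLowerInvariant()` is one commonly used way to get the same kind of hex string, which should then match the Java output for the same input.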

For the record, the Java equivalent of your current C# code would be:

    return new String(input, "UTF-8");

Currently you are calling .toString(), which for a Java byte array results in a call to the Object.toString() method. This returns the type and hash code of the object, not its contents.
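To make the difference concrete, here is a small sketch contrasting the inherited toString() with java.util.Arrays.toString(), which does look at the contents. The class and method names are invented for the example.

```java
import java.util.Arrays;

// Hypothetical class name, used only for this illustration.
final class ByteArrayToStringDemo {

    // Object.toString(): type tag "[B", then '@', then the hash code in hex.
    static String identityString(byte[] data) {
        return data.toString();
    }

    // Arrays.toString(): the actual element values as decimal numbers.
    static String contentString(byte[] data) {
        return Arrays.toString(data);
    }

    public static void main(String[] args) {
        byte[] data = {1, 2, 3};
        // Starts with "[B@"; the part after '@' varies between runs.
        System.out.println(identityString(data));
        // Prints [1, 2, 3]
        System.out.println(contentString(data));
    }
}
```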

edited Jun 18 '14 at 14:25

answered Jun 18 '14 at 14:19


It's not only a problem that many of the characters are not printable; an arbitrary byte array will probably also contain invalid UTF-8 sequences. In Java, such sequences are converted to the Unicode character U+FFFD (REPLACEMENT CHARACTER). This loses information from the hash and defeats its purpose, and you cannot be sure that invalid UTF-8 sequences are treated identically in other languages. – jarnbjo Jun 18 '14 at 14:49
@jarnbjo Thanks for the additional info. This explains the question marks appearing within the output: http://www.fileformat.info/info/unicode/char/0fffd/index.htm – Duncan Jones Jun 18 '14 at 14:51
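
The replacement behaviour described in these comments can be seen with a short sketch: decoding bytes that are not valid UTF-8 and encoding the result again does not give back the original bytes. The class name, the roundTrip helper, and the sample bytes are all chosen just for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical class name, used only for this illustration.
final class Utf8RoundTripDemo {

    // Decodes raw bytes as UTF-8, then encodes the resulting string back to bytes.
    static byte[] roundTrip(byte[] raw) {
        String decoded = new String(raw, StandardCharsets.UTF_8);
        return decoded.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // 0xFF and 0xFE can never appear in valid UTF-8; 0x41 is the letter 'A'.
        byte[] raw = {(byte) 0xff, 0x41, (byte) 0xfe};
        String decoded = new String(raw, StandardCharsets.UTF_8);

        // The invalid bytes were replaced by U+FFFD during decoding.
        System.out.println(decoded.indexOf('\uFFFD') >= 0);
        // The round trip does not reproduce the original bytes.
        System.out.println(Arrays.equals(raw, roundTrip(raw)));
    }
}
```

Plain ASCII survives the same round trip unchanged, which is why the problem only shows up with binary data such as a hash.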