
For my employer, I have to present customers of a web app with checksums for certain files they download.

I'd like to present the hash that the user's client tools are also likely to generate, so I have been comparing online hashing tools. My question is about how they hash, since, strangely enough, their results differ.

After a quick search I tested five:

  1. http://www.convertstring.com/Hash/SHA256
  2. http://www.freeformatter.com/sha256-generator.html#ad-output
  3. http://online-encoder.com/sha256-encoder-decoder.html
  4. http://www.xorbin.com/tools/sha256-hash-calculator
  5. http://www.everpassword.com/sha-256-generator

Entering the value 'test' (without pressing Enter after it), all five give me the same SHA256 result. However, and here the peculiar thing begins: when I enter 'test[enter]test' (two lines), sites 1, 2 and 3 give me one SHA256 hash, and sites 4 and 5 give me a different one (1, 2 and 3 agree with each other, as do 4 and 5). This most likely has to do with how the tool or its underlying code handles \r\n, or at least I think so.

Coincidentally, sites 1, 2 and 3 present me with the same hash as my C# code does:

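    // Hash the UTF-8 bytes of the two-line input, using \r\n as the line break.
    var sha256Now = ComputeHash(Encoding.UTF8.GetBytes("test\r\ntest"), new SHA256CryptoServiceProvider());

    // Returns the digest as uppercase hex pairs separated by dashes (BitConverter's format).
    private static string ComputeHash(byte[] inputBytes, HashAlgorithm algorithm)
    {
        var hashedBytes = algorithm.ComputeHash(inputBytes);
        return BitConverter.ToString(hashedBytes);
    }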

The question is: which sites are 'right'?

Is there any way to know if a hash is compliant with the standard?

UPDATE1: Changed the encoding to UTF8. This has no influence on the resulting hash, though (probably because my Encoding.Default is Encoding.UTF8). Thanks, @Hans.

UPDATE2: Maybe I should expand the question a bit, since it may have been under-explained, sorry. I guess what I am asking is more of a usability question than a technical one: should I offer hashes for all the different line endings, or should I stick to one? A client whose tool calculates the hash differently will probably call my company, afraid that their file was changed somehow. How is this usually solved?

Gerben Rampaart
  • Which line ending do you consider to be right? Choose from \r, \n or \r\n. Fret about text encoding while you are at it. – Hans Passant Dec 26 '13 at 12:09
  • @Hans, I have no idea what you are trying to tell me. – Gerben Rampaart Dec 26 '13 at 12:10
  • A hash algorithm works on *bytes*. There are lots of different ways to convert strings to bytes, all of them are 'right'. You like Encoding.Default for some reason, you'll regret that some day when your code runs on another machine with a different default. There are a hundred more to choose from. As long as you *have* to pick one that doesn't depend on a machine setting then you can't really go wrong with Encoding.UTF8. – Hans Passant Dec 26 '13 at 12:17
  • Then I'll use UTF8. Especially since you're guaranteeing I won't get fired. – Gerben Rampaart Dec 26 '13 at 12:18

1 Answer


All those sites return valid values.

Sites 4 and 5 use \n as the line break, while sites 1, 2 and 3 use \r\n, which is why they match your C# code.


EDIT

I see you edited your question to add Encoding.Default.GetBytes in the code example.

This is interesting, because it shows that a string-to-byte-array conversion has to run before the hash is computed. The line break (\n or \r\n) and the text encoding both affect how your string is turned into bytes, so different choices give different byte values.

Once you have the same bytes as input, all hash results will be identical.
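
To make that concrete, here is a minimal sketch, assuming a recent .NET SDK (top-level statements, C# 9 or later). `Sha256Hex` is a hypothetical helper, not something from the question; it formats the digest as lowercase hex without dashes, the form the online tools typically show.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper: SHA-256 of a byte array as lowercase hex without dashes.
static string Sha256Hex(byte[] bytes)
{
    using var sha = SHA256.Create();
    return BitConverter.ToString(sha.ComputeHash(bytes)).Replace("-", "").ToLowerInvariant();
}

// The same two lines of text, converted to bytes with different line breaks.
byte[] crlf = Encoding.UTF8.GetBytes("test\r\ntest"); // 10 bytes
byte[] lf = Encoding.UTF8.GetBytes("test\ntest");     // 9 bytes

Console.WriteLine(Sha256Hex(crlf));
Console.WriteLine(Sha256Hex(lf));

// Different bytes in, different digest out.
Console.WriteLine(Sha256Hex(crlf) == Sha256Hex(lf)); // False

// Identical bytes in, identical digest out: "test" is pure ASCII,
// so the UTF-8 and ASCII encodings produce the same four bytes.
Console.WriteLine(
    Sha256Hex(Encoding.UTF8.GetBytes("test")) ==
    Sha256Hex(Encoding.ASCII.GetBytes("test"))); // True
```

The hash function itself never sees "text" or "line endings"; it only sees the bytes produced by the conversion step.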


EDIT 2:

If you're dealing with bytes directly, then just compute the hash of those bytes. Don't try to provide several hash values; a hash function returns exactly one value for a given input. If your clients get a different hash value than yours, then they are doing it wrong.

That being said, I'm pretty sure it won't ever happen because there isn't any way to misinterpret a byte array.
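
For files, the idea can be sketched as follows, again assuming a recent .NET SDK with top-level statements. `Sha256OfFile` is a hypothetical helper name, and the temporary file only stands in for a real download; the point is that the file is read as a raw stream and never decoded as text.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

// Hypothetical helper: SHA-256 of a file's raw bytes as lowercase hex.
// The file is read as a stream; no text decoding or line-ending handling occurs.
static string Sha256OfFile(string path)
{
    using var sha = SHA256.Create();
    using var stream = File.OpenRead(path);
    return BitConverter.ToString(sha.ComputeHash(stream)).Replace("-", "").ToLowerInvariant();
}

// Demo on a temporary file that holds the four ASCII bytes of "test".
string path = Path.GetTempFileName();
File.WriteAllBytes(path, new byte[] { 0x74, 0x65, 0x73, 0x74 });
Console.WriteLine(Sha256OfFile(path));
// prints 9f86d081884c7d659a2feaa0c55ad015a3bf4f1b2b0b822cd15d6c15b0f00a08
File.Delete(path);
```

Client-side file hashing tools such as `sha256sum`, PowerShell's `Get-FileHash`, or `certutil -hashfile` also work on the file's bytes, so they should report the same digest for an unmodified download, apart from differences in letter case or separators in the hex output.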

ken2k
  • I understand. But presenting the hash to a client makes the hash either 'wrong' or 'right' in their eyes, don't you think? Do you think I should present hashes with both \r AND \r\n? I'm fine with that btw, but I have no idea whether that is the way to go in presenting hashes to clients. – Gerben Rampaart Dec 26 '13 at 12:13
  • @GerbenRampaart If your input are _files_ (as specified in your question), then you won't even deal with line breaking nor text encoding. You just read the files using a stream and pass it to the hash algorithm; you won't have to _interpret_ text. – ken2k Dec 26 '13 at 12:20
  • You're right, that is actually my situation; I reworked the code example to better explain my interaction with the online tools. But you're right, I have a byte[] incoming. That changes nothing about the hash I am getting from the files I have tested so far. I can do File.ReadAllBytes(path); and use the byte[] to compute, or do File.ReadAllText(path); and afterward do Encoding.UTF8.GetBytes(). Both give the same SHA256 results. – Gerben Rampaart Dec 26 '13 at 12:33
  • Wait, I'm starting to pick up on what you are saying ... (finally). So the online tools, since they are string based, will differ per platform (line endings). If I had some way of feeding those online tools a byte[], we would all actually end up with the same hash? – Gerben Rampaart Dec 26 '13 at 12:43
  • @GerbenRampaart yes, exactly – ken2k Dec 26 '13 at 12:45