
I have a use case similar to that described in this article: Calculate MD5 checksum for a file

Since the article apparently refers to .NET Framework and I use .NET Core, I wanted to find out more about hashes in .NET Core. After a little research, I came across a statement in Microsoft's System.String documentation. It says the following about the GetHashCode() method:

The hash code itself is not guaranteed to be stable. Hash codes for identical strings can differ across .NET implementations, across .NET versions, and across .NET platforms (such as 32-bit and 64-bit) for a single version of .NET. In some cases, they can even differ by application domain. This implies that two subsequent runs of the same program may return different hash codes.

Since I am not very familiar with hashes and the behavior occurs only in .NET Core, the above statement worried me. Can an MD5 hash be stored in the database as a checksum, and is this checksum calculated identically on every platform? Or is it like GetHashCode(), where the result can differ between platforms or even between program executions?

onestarblack
  • And how is `object.GetHashCode()` related to an MD5 hash? `GetHashCode()` in C# and `hashCode()` in Java are meant for use in collections like Dictionary/Map. – Selvin Jan 28 '20 at 11:52
  • A hash code is just a general name for the output of any algorithm that turns a potentially unlimited number of bits into a smaller, fixed number of bits. The MD5 algorithm is completely unrelated to what the `.GetHashCode()` method does -- and yes, it's deterministic. (Note that MD5 is well-known to be cryptographically weak, so it can't be used to detect *deliberate* tampering with a file, and crypto hashes in general weren't designed to do error detection/correction for transmission purposes. Different algorithms for different things.) – Jeroen Mostert Jan 28 '20 at 11:55
  • Stable ≠ deterministic. *All* hashes are deterministic, otherwise they are not hashes, by definition. Furthermore, *MD5* is a specific hashing function and, as such, is also stable. But `object.GetHashCode` obviously doesn’t use MD5 in general. – Konrad Rudolph Jan 28 '20 at 11:57
  • @JeroenMostert, KonradRudolph Thanks a lot, that was exactly the information I needed! – onestarblack Jan 28 '20 at 12:00
  • Hash codes are not unique. Multiple inputs can produce the same output. That is why a type that implements `IEquatable<T>` should provide both a `GetHashCode()` method and an `Equals()` method. Hash-based collections compare the hash codes first and call `Equals()` only when they match. See: https://learn.microsoft.com/en-us/dotnet/api/system.iequatable-1.equals?view=netframework-4.8 – jdweng Jan 28 '20 at 12:51
  • @JeroenMostert - You should make this an answer, this is a good explanation. – martinstoeckli Jan 28 '20 at 13:01
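
The distinction drawn in the comments above can be illustrated with a minimal C# sketch. It is only an illustration under assumptions: the class name `Md5Demo` and its helper methods are hypothetical names introduced here, not part of .NET. It relies on `System.Security.Cryptography.MD5`, which computes the standardized MD5 digest and therefore yields the same 16 bytes for the same input bytes on every platform and in every run.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

// Hypothetical helper class, introduced only for this illustration.
public static class Md5Demo
{
    // Returns the MD5 digest of a byte array as a lowercase hex string.
    public static string HashBytes(byte[] data)
    {
        using (MD5 md5 = MD5.Create())
        {
            return ToHex(md5.ComputeHash(data));
        }
    }

    // Returns the MD5 digest of a file's contents as a lowercase hex string.
    // The file is read as a stream, so it is not loaded into memory at once.
    public static string HashFile(string path)
    {
        using (MD5 md5 = MD5.Create())
        using (FileStream stream = File.OpenRead(path))
        {
            return ToHex(md5.ComputeHash(stream));
        }
    }

    // BitConverter.ToString yields "5E-B6-..."; strip the dashes and lowercase it.
    private static string ToHex(byte[] hash)
    {
        return BitConverter.ToString(hash).Replace("-", "").ToLowerInvariant();
    }
}
```

For example, `Md5Demo.HashBytes(System.Text.Encoding.UTF8.GetBytes("hello world"))` returns `5eb63bbbe01eeed093cb22bb8f5acdc3` on any platform, so the value is suitable for storing in a database as a checksum against accidental changes. By contrast, `"hello world".GetHashCode()` may return a different integer in each run on .NET Core. When hashing text rather than a file, the encoding has to be fixed (UTF-8 here), because MD5 operates on bytes and different encodings produce different bytes.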

0 Answers