15

Possible Duplicate:
How do I create a HashCode in .net (c#) for a string that is safe to store in a database?

I use C# 4.0 and get a string's hash by calling:

"my string".GetHashCode()

The hash code returned by this call is stored in a database for future use. It is used to find a subset of candidate strings, which are then compared for equality.

Questions are:

  1. Is this a standardized hash calculation? May I assume that the same hash can be calculated in different environments, such as C# in .NET 3.0 or future .NET versions?
  2. Is it possible to compute the same hash function yourself by writing it in Java, PL/SQL, Ruby, etc.?
  3. Can I assume that a hash generated today will be the same tomorrow in the same environment? For example, if I shut down my computer and run the program again, or change the locale or some other setting?
  4. What are the limits of portability?
  5. I know I can implement it myself, but maybe some kind of portability is provided?
Max
    The answers to your questions are NO, NO, NO, NO, NO, there is no "portability" whatsoever, and there is no "portability" whatsoever. **Under absolutely no circumstances should you be doing what you are describing.** – Eric Lippert Nov 07 '11 at 18:33

4 Answers

19

From MSDN:

The default implementation of the GetHashCode method does not guarantee unique return values for different objects. Furthermore, the .NET Framework does not guarantee the default implementation of the GetHashCode method, and the value it returns will be the same between different versions of the .NET Framework. Consequently, the default implementation of this method must not be used as a unique object identifier for hashing purposes.

So no, you cannot assume that the value produced by GetHashCode is stable. This isn't just theoretical, either - we've seen the value change in the past.

If you want a stable hash, you'll have to generate it yourself.
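As a minimal sketch of what "generate it yourself" could look like, the following uses the 32-bit FNV-1a algorithm over the string's UTF-8 bytes. This is a swapped-in technique chosen for illustration, not something the framework provides; the class name `StableHash`, the method name `Fnv1a32`, and the choice of UTF-8 are all assumptions. Because the algorithm is fully specified by a fixed offset basis and prime, the same function can be reimplemented in Java, PL/SQL, Ruby, and so on, provided every implementation hashes the same byte encoding.

```csharp
using System;
using System.Text;

// Hypothetical helper; not part of the .NET Framework.
public static class StableHash
{
    // 32-bit FNV-1a: offset basis 2166136261, prime 16777619.
    public static uint Fnv1a32(string text)
    {
        if (text == null) throw new ArgumentNullException("text");

        // Hash the UTF-8 bytes (an assumption) so the result does not depend
        // on the runtime's in-memory string representation.
        byte[] bytes = Encoding.UTF8.GetBytes(text);

        uint hash = 2166136261;
        unchecked // overflow must wrap around, whatever the compiler settings
        {
            foreach (byte b in bytes)
            {
                hash ^= b;        // mix in the next byte
                hash *= 16777619; // multiply by the FNV prime
            }
        }
        return hash;
    }
}
```

Like any 32-bit hash, this will produce collisions, so it suits the "find candidates, then compare for equality" use described in the question rather than use as a unique identifier.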

Michael Petrotta
  • "The default implementation" - does `String` use the default implementation? I honestly don't know, I just wouldn't expect it (as they are treated by value in hash tables). –  Nov 06 '11 at 19:04
  • `System.String` overrides `GetHashCode`, and contains a similar note in [its documentation](http://msdn.microsoft.com/en-us/library/system.string.gethashcode.aspx): *The behavior of GetHashCode is dependent on its implementation, which might change from one version of the common language runtime to another. A reason why this might happen is to improve the performance of GetHashCode.*, and *The value returned by GetHashCode is platform-dependent. It differs on the 32-bit and 64-bit versions of the .NET Framework.* – Michael Petrotta Nov 06 '11 at 19:06
18

Rule: Consumers of GetHashCode cannot rely upon it being stable over time or across appdomains.

Raymond Chen
3

http://msdn.microsoft.com/en-us/library/system.object.gethashcode.aspx

the .NET Framework does not guarantee the default implementation of the GetHashCode method, and the value it returns will be the same between different versions of the .NET Framework. Consequently, the default implementation of this method must not be used as a unique object identifier for hashing purposes.

Andrey Sboev
2

No. It is not portable. You should never use this method for anything other than balancing a hash table. Its implementation has changed between versions of the Framework, and it behaves differently on the 32-bit and 64-bit CLR.

Eric Lippert has a blog post on rules and proper uses for this function.

Instead, you should use SHA1Managed to compute the hash that you store in the database.
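A minimal sketch of that approach, assuming the string is encoded as UTF-8 and the digest is stored as lowercase hexadecimal text (both are illustrative choices, as are the names `DatabaseHash` and `Sha1Hex`):

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper; the class and method names are illustrative only.
public static class DatabaseHash
{
    // Returns the SHA-1 digest of the string's UTF-8 bytes as 40 hex characters.
    public static string Sha1Hex(string text)
    {
        if (text == null) throw new ArgumentNullException("text");

        // Fix the encoding explicitly (UTF-8 here, an assumption) so that
        // other platforms can reproduce the same input bytes.
        byte[] bytes = Encoding.UTF8.GetBytes(text);

        using (SHA1Managed sha1 = new SHA1Managed())
        {
            byte[] digest = sha1.ComputeHash(bytes);

            StringBuilder hex = new StringBuilder(digest.Length * 2);
            foreach (byte b in digest)
            {
                hex.Append(b.ToString("x2")); // two lowercase hex digits per byte
            }
            return hex.ToString();
        }
    }
}
```

Since SHA-1 is a published standard, any other language that hashes the same bytes should produce the same digest, which is the portability that `GetHashCode` does not offer.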

vcsjones