
Possible Duplicate:
C# Why can equal decimals produce unequal hash values?

I've come across an issue in my .NET 3.5 application (x86 or x64, I've tried both) where decimals with a different number of trailing zeros have different hash codes. For example:

decimal x = 3575.000000000000000000M;
decimal y = 3575.0000000000000000000M;

Console.WriteLine(x.GetHashCode());
Console.WriteLine(y.GetHashCode());
Console.WriteLine(x == y);
Console.WriteLine(x.GetHashCode() == y.GetHashCode());

Outputs the following on my machine:

1085009409
1085009408
True
False

I presume the difference in hash codes is down to the different internal representations of the two numbers caused by the differing scale factors.

Whilst I can work around the issue by removing the trailing zeros, I always assumed that GetHashCode should return the same value for x and y if x == y. Is this assumption wrong, or is this a problem with Decimal.GetHashCode?

EDIT: To be clear on versions, I'm using Visual Studio 2008 SP1 with .NET 3.5.

MrKWatkins
  • Is this your actual code? This returns `1085009408, 1085009408, True True` for me. - Edit: that was .NET 4, different results on .NET 3.5 confirmed. – CodeCaster Jul 02 '12 at 16:48
  • I got the same output as the OP on .NET 3.5. @CodeCaster, what version are you running it on? – Servy Jul 02 '12 at 16:50
  • So I went and looked at the bits of the decimal and they are different for x and y (with .NET pre 3.5). Clearly the `Equals` method accounts for this difference, but the `GetHashCode` does not. – Servy Jul 02 '12 at 17:03
  • @CodeCaster Yes it is a duplicate of that by the looks of things. I did search for similar questions before posting but missed that one, good find. – MrKWatkins Jul 03 '12 at 08:50

2 Answers


This is a problem with Decimal.GetHashCode, for .NET Framework version 3.5 and lower. When two values are considered equal, they must return the same hash code, per the guidelines; in this case, decimal clearly does not. You should always expect two equal objects to have the same hash code.

Per MSDN:

If two objects compare as equal, the GetHashCode method for each object must return the same value.
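Hash-based collections such as HashSet<T> and Dictionary<TKey, TValue> rely on this rule: keys whose hash codes differ are never treated as equal, whatever Equals would say. The following is a minimal sketch of one way to stay safe on affected frameworks, in the spirit of the trailing-zero workaround mentioned in the question. The names DecimalNormalizer, StripTrailingZeros and ScaleInsensitiveDecimalComparer are hypothetical and not part of the framework, and the normalization shown is only an assumption about how such a workaround could be written, not the asker's actual fix.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: rewrites a decimal so that equal values share one bit pattern.
static class DecimalNormalizer
{
    public static decimal StripTrailingZeros(decimal value)
    {
        int[] bits = decimal.GetBits(value);
        int scale = (bits[3] >> 16) & 0xFF;   // digits after the decimal point
        bool negative = bits[3] < 0;          // sign is the top bit of the flags word

        // Treat the 96-bit mantissa as a non-negative integer (scale 0).
        decimal mantissa = new decimal(bits[0], bits[1], bits[2], false, 0);
        if (mantissa == 0m)
        {
            return 0m;                        // collapse 0.00 and -0.0 into plain zero
        }

        // Drop one trailing zero per iteration while the scale allows it.
        while (scale > 0 && mantissa % 10m == 0m)
        {
            mantissa = decimal.Truncate(mantissa / 10m);
            scale--;
        }

        int[] m = decimal.GetBits(mantissa);
        return new decimal(m[0], m[1], m[2], negative, (byte)scale);
    }
}

// Hypothetical comparer: hashes the normalized form, so x == y implies equal hash codes.
sealed class ScaleInsensitiveDecimalComparer : IEqualityComparer<decimal>
{
    public bool Equals(decimal x, decimal y)
    {
        return x == y;
    }

    public int GetHashCode(decimal value)
    {
        return DecimalNormalizer.StripTrailingZeros(value).GetHashCode();
    }
}

static class ComparerDemo
{
    static void Main()
    {
        decimal x = 3575.000000000000000000M;
        decimal y = 3575.0000000000000000000M;

        HashSet<decimal> set = new HashSet<decimal>(new ScaleInsensitiveDecimalComparer());
        set.Add(x);
        set.Add(y);   // treated as a duplicate: hash codes match and Equals returns true

        Console.WriteLine(set.Count);   // 1
    }
}
```

Passing the comparer to the collection keeps the stored values untouched; only the hash is computed from the normalized form, so the original scale is still available when the values are read back.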

Reproducing

I have tried your exact code against different versions of the .NET Framework, and the results are:

╔══════════════════╤══════════════════╗
║Framework version │ Hashcode equal ? ║
╟──────────────────┼──────────────────╢
║      2.0         │  No.             ║
║      3.0         │  No.             ║
║      3.5         │  No.             ║
║      4.0         │  Yes.            ║
║      4.5         │  Yes.            ║
╚══════════════════╧══════════════════╝

In other words, it seems you stumbled upon a bug in the .NET Framework that was fixed in .NET Framework 4.

The above results were obtained using Visual Studio 2012 RC, switching the target framework in the project's property pages.

Microsoft acknowledges the bug here.

Joey
driis
  • I saw those docs too so I assumed it must be a bug, but thought I'd get some other opinions on here first; bugs in .NET are fairly rare after all! – MrKWatkins Jul 02 '12 at 16:58
  • It actually looks like it is a bug, that is fixed with the current version; see update to the answer. – driis Jul 02 '12 at 17:02
  • Thanks for your work here, much appreciated. Another reason to upgrade to .NET 4... – MrKWatkins Jul 03 '12 at 08:48
  • @MrKWatkins Although the problem is solved for these particular numbers, if you check the "duplicate" question linked at the top, you will see that the problem still occurs (with some specific numbers) in .NET 4.0 and .NET 4.5. – Jeppe Stig Nielsen Sep 19 '12 at 19:17
  • @Jeppe Stig Nielsen Thanks for pointing that out, good to know I can't remove my fix (remove the trailing zeros) if we upgrade to .NET 4.x. – MrKWatkins Oct 12 '12 at 08:56

This was a fairly infamous bug in .NET versions prior to .NET 4. The Decimal.GetHashCode() implementation depended on the raw bit values of the decimal. Those bits differ because decimal tracks the number of known digits in the fraction, which you can see by calling Decimal.GetBits() on the values. Whether this is a bug is actually debatable: depending on what kind of glasses you wear, the decimals do have different values.

Nevertheless, Microsoft agreed that this was unintuitive behavior and fixed it in .NET 4; the relevant feedback article is here.
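The differing bit patterns can be inspected with Decimal.GetBits(), which returns four ints: the low, middle and high 32 bits of the mantissa, followed by a flags word whose bits 16 to 23 hold the scale. A small sketch, using shorter values than the question for readability; GetScale is a hypothetical helper written for this illustration, not a framework method:

```csharp
using System;

static class DecimalBitsDemo
{
    // Reads the scale (number of digits after the decimal point) from the flags word.
    public static int GetScale(decimal value)
    {
        return (decimal.GetBits(value)[3] >> 16) & 0xFF;
    }

    static void Main()
    {
        decimal a = 1.0m;
        decimal b = 1.00m;

        Console.WriteLine(a == b);                  // True
        Console.WriteLine(decimal.GetBits(a)[0]);   // 10  (mantissa, low 32 bits)
        Console.WriteLine(GetScale(a));             // 1
        Console.WriteLine(decimal.GetBits(b)[0]);   // 100
        Console.WriteLine(GetScale(b));             // 2
    }
}
```

Both values compare as equal, yet the mantissa and scale are stored differently, which is the difference a bit-based hash picks up.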

Hans Passant
  • I noticed the problem due to playing about with GetBits; from that point of view they're definitely very different, which has been causing me a few headaches... I didn't notice the GetHashCode problem for a while though as for a lot of trailing zeros they do produce the same hash code... For example 3575 with 0 -> 17 or 19 -> 22 trailing zeros all give the same hash code yet have quite different GetBits() results. Would love to see the actual GetHashCode implementation so I could work out why... – MrKWatkins Jul 03 '12 at 08:47
  • Download SSCLI20, clr/src/vm/comdecimal.cpp source code file, COMDecimal::GetHashCode() function. – Hans Passant Jul 03 '12 at 12:23
  • The fact that `Decimal.Equals()` indicates that the numbers are equal but `GetHashCode` indicates that they are unequal indicates that either `Decimal`'s override of `Object.Equals` or `GetHashCode` is faulty. Personally, I would regard the fault as being with the override of `Object.Equals`, since 1.0m and 1.00m aren't really any more alike than "CAT" and "cat". There are situations where one would want to use a custom comparer that mapped them as equal (likewise with "CAT" and "cat"), but IMHO `Object.Equals` should imply strong semantic equivalence, such that two immutable objects... – supercat Aug 27 '12 at 17:12
  • ...whose members all report `Equal` could be regarded as interchangeable (so that if nothing cared about reference equality, code could abandon one and use the other every place the former was used). Because it takes much less time to call `Equals` on two references that point to the same object than to compare distinct objects that hold identical values, such substitution can impart a major performance boost when using such types. `Object.Equals` would seem the most universal comparison method for interning; too bad some types (e.g. `Decimal`) use it for something other than equivalence. – supercat Aug 27 '12 at 17:22