0

I have a Point3d struct which implements IEquatable<Point3d> in the following manner:

public override bool Equals(object obj) {
    return obj is Point3d p && Equals(p);
}

public bool Equals(Point3d other) {
    return Equals(other, Tolerance.ToleranceDecimal);
}

public bool Equals(Point3d other, double tolerance) {
    if (tolerance < 0) throw new ArgumentOutOfRangeException(nameof(tolerance), tolerance, "Expected a tolerance greater than or equal to 0");
    return Math.Abs(X - other.X) <= tolerance && Math.Abs(Y - other.Y) <= tolerance && Math.Abs(Z - other.Z) <= tolerance;
}

public override int GetHashCode() {
    var hash = 17;
    hash = hash * 23 + X.GetHashCode();
    hash = hash * 23 + Y.GetHashCode();
    hash = hash * 23 + Z.GetHashCode();
    return hash;
}

public static bool operator ==(Point3d firstPoint, Point3d secondPoint) {
    return firstPoint.Equals(secondPoint);
}

public static bool operator !=(Point3d firstPoint, Point3d secondPoint) {
    return !(firstPoint == secondPoint);
}

This is already heavily used within an application with the expectation that checking equality between two points allows for a tolerance (which is necessary for the implementation to work correctly).

If has come to my attention that the Equals and GetHashCode methods are not consistent, and indeed it is not possible to write GetHashCode in a form that would produce good and consistent results. This issue is particularly problematic in situations where a Linq query is used, such as points.Distinct() as the resulting points may be considered Equal if comparing directly, such as points[0] == points[1]

Personally I believe the best option would be to change Equals as follows, such that it's behaviour is consistent with GetHashCode:

public bool Equals(Point3d other) {
    return Equals(other, 0);
}

However, as this is already heavily utilised within an application this would be a significant breaking change. I believe it is the wrong thing to do, but I'm considering changing GetHashCode to:

public override int GetHashCode() {
    return 0;
}

My understanding is that the above would force the Equals method to be utilised which would result in a performance hit, but also allow for points within a tolerance to be considered equal within Linq queries. I would like to know if this opens me up to any other potential pitfalls.

I'm unsure as to what other avenues may be available to me, so am very much looking for advice as to the best approach to resolve this issue.

Thanks in advance!

Dmitry Bychenko
  • 180,369
  • 20
  • 160
  • 215
Rob Kite
  • 395
  • 1
  • 11
  • 1
    Look here: https://stackoverflow.com/questions/41492345/using-iequalitycomparer-gethashcode-with-a-tolerance – MistyK Jul 24 '18 at 20:37
  • Ok so I get the message that I should change the equals method such that it does not allow for a tolerance. What I’m still missing is the downside (besides performance) with having GetHashCode just return 0 and forcing the Equals method to be used with a tolerance. – Rob Kite Jul 24 '18 at 20:53
  • @RobKite What, specifically, will happen when you violate the contract of that interface is entirely dependent on what you're using it to do. Different code using that interface will do wildly different things with them, and will therefore have wildly different effects when you violate the interface's contract. Most also won't break predictably. Their result will typically be undefined when you provide inputs that violate their contracts. – Servy Jul 24 '18 at 20:56
  • @Servy Could you give an example of what it can do, except performance hit =O(n) when using it in dictionary? I can't think of any situation that would give incorrect results – MistyK Jul 24 '18 at 20:57
  • @MistyK The OP said nothing about using it in a dictionary. That said, you could end up with multiple different equal objects as keys, but having different values, you'll get different results for different operations based on the order in which you take given actions (adding the same items to the dictionary in a different order changes the resulting dictionary), etc. – Servy Jul 24 '18 at 21:00
  • @Servy Makes sense, thanks. However if we consider two points equal according to Equals method then we don't really have two different values. The value behind each instance is different but if it's inside given tolerance then it wouldn't make any differences regardless of the order of operations. Ok it would give you sligthly different results in terms of numbers but if we assume that Equals was implemented correctly from class point of view then it doesn't really matter if we take one instance or another right? – MistyK Jul 24 '18 at 21:20
  • @MistyK No. Consider a tolerance of 5. Now add 4, 1, then 7 to the dictionary as keys. Now add 1, 4, then 7. The first dictionary has one pair, the second has two, despite you adding the same values to the dictionary. The problem is that the *Equals* method doesn't fulfill *its* contract. Not that the hash code doesn't fulfill its contract. – Servy Jul 24 '18 at 21:22
  • @Servy very good example, thanks! – MistyK Jul 24 '18 at 21:28
  • @Servy / MistyK - your discussion has certainly cleared up some uncertainty for me, thanks! – Rob Kite Jul 24 '18 at 21:40

1 Answers1

4

The bitter truth is that you can't implement correct Equals with arbitrary tolerance. Equals (see https://msdn.microsoft.com/en-us/library/336aedhh(v=vs.100).aspx for details) must be transitive i.e. (x.Equals(y) && y.Equals(z)) returns true if and only if x.Equals(z) returns true.

And here we can create a counter example for given Tolerance.ToleranceDecimal:

 Point3d x = new Point3d(-Tolerance.ToleranceDecimal * 2.0 / 3.0, 0, 0);
 Point3d y = new Point3d(0, 0, 0);
 Point3d z = new Point3d(Tolerance.ToleranceDecimal * 2.0 / 3.0, 0, 0);

As you can see

 x.Equals(y) == true
 y.Equals(z) == true

But

 x.Equals(z) == false

Since Equals implementation is incorrect we can't create corresponding GetHashCode, except degenerated (and useless)

 public override int GetHashCode() {
   return 0;
 }

because GetHashCode must return the same value for x and y if x.Equals(y) == true. In our case: let x < y and y = x + N * tolerance

 x equals to 
 x + tolerance / 2.0 equals to
 x + tolerance / 2.0 * 2 equals to
 x + tolerance / 2.0 * 3 equals to
 ...
 x + tolerance / 2.0 * 2 * N equals to
 y

which means that for any arbitrary x and y and non-zero tolerance GetHashCode must return the same value for any argument.

Dmitry Bychenko
  • 180,369
  • 20
  • 160
  • 215
  • Thanks all for the quick and concise answers. – Rob Kite Jul 24 '18 at 21:11
  • @Rob Kite: you are welcome! – Dmitry Bychenko Jul 24 '18 at 21:12
  • Thank you for the very clear and valuable post... but: Yes... this is perfectly true... and this made me questing some of my code... And after reading this post and the MSDN link I also read https://learn.microsoft.com/en-us/dotnet/api/system.single.equals?view=netframework-4.8. Here Microsoft self describes the introduction of a tolerance for determining the equality of floating points. So we can't use a tolerance for equality, but we should use a tolerance for equality... euh? – SvenL Jan 13 '20 at 19:43
  • @SvenL: comparing (please, note, just *comparing*, not *implementing* custom `Equals` and `GetHashCode` methods) two floating point values with tolerance is a good practice: `if (Math.Abs(left - right) < tolerance) {/* we assume left == right */}` – Dmitry Bychenko Jan 13 '20 at 20:04
  • I fully agree, but when implementing types like Vector2D or Point or Coordinate, something with floating point values at its core, tolerant equality is (sometimes) necessary. And sometimes it would be nice to allow this in the Equals, so the framework would treat these values as equals... but i see your point and so your post sent me back to the drawing board and rethink the approach i took. – SvenL Jan 13 '20 at 20:10
  • @SvenL: if you have to implement `Equals`, you can try *scaling* (or *rounding*) initial floating point values into `int`egers, e.g. `return ((int)(x * scale) == (int)(other.x * scale)) && ((int)(y * scale) == (int)(other.y * scale));` or `return (Math.Round(x * scale) == Math.Round(other.x * scale)) && (Math.Round(y * scale) == Math.Round(other.y * scale));` – Dmitry Bychenko Jan 13 '20 at 20:17