15

When writing unit-tests, I often face the situation when equals() for some object in tests -- in assertEquals -- should work differently from how it works in actual environment. Take for example some interface ReportConfig. It has id and several other fields. Logically, one config equals to another one when their ids match. But when it comes to testing some specific implementation, say, XmlReportConfig, obviously I want to match all fields. One solution is not to use equals in tests and just iterate over the object properties or fields and compare them, but it doesn't seem like a good solution.

So, apart from this specific type of situations, I want to sort out what are best practices to implement equals, semantically, not technically.

Gray
  • 115,027
  • 24
  • 293
  • 354
Andrey Balaguta
  • 1,308
  • 2
  • 21
  • 28
  • 10
    `When writing unit-tests, I often face the situation when equals() for some object in tests -- in assertEquals -- should work differently from how it works in actual environment.` Why would you ever want this? – helpermethod Mar 16 '12 at 14:34
  • 1
    Because there's identity, and equality. For a lot of situations, identity is simply good enough, especially in DB apps. In testing, though, there can easily be situations where you're, perhaps, cloning a "before" copy to compare to an "after" copy, and in these cases identity isn't enough, you want to go deeper in to the actual fields of the object. – Will Hartung Mar 16 '12 at 15:52

5 Answers5

21

what are best practices to implement equals, semantically, not technically.

In Java the equals method really should be considered to be "identity equals" because of how it integrates with Collection and Map implementations. Consider the following:

 public class Foo() {
    int id;
    String stuff;
 }

 Foo foo1 = new Foo(10, "stuff"); 
 fooSet.add(foo1);
 ...
 Foo foo2 = new Foo(10, "other stuff"); 
 fooSet.add(foo2);

If Foo identity is the id field then the 2nd fooSet.add(...) should not add another element to the Set but should return false since foo1 and foo2 have the same id. If you define Foo.equals (and hashCode) method to include both the id and the stuff fields then this might be broken since the Set may contain 2 references to the object with the same id field.

If you are not storing your objects in a Collection (or Map) then you don't have to define the equals method this way, however it is considered by many to be bad form. If in the future you do store it in a Collection then things will be broken.

If I need to test for equality of all fields, I tend to write another method. Something like equalsAllFields(Object obj) or some such.

Then you would do something like:

assertTrue(obj1.equalsAllFields(obj2));

In addition, a proper practice is to not define equals methods which take into account mutable fields. The problem also gets difficult when we start talking about class hierarchies. If a child object defines equals as a combination of its local fields and the base class equals then its symmetry has been violated:

 Point p = new Point(1, 2);
 // ColoredPoint extends Point
 ColoredPoint c = new ColoredPoint(1, 2, Color.RED);
 // this is true because both points are at the location 1, 2
 assertTrue(p.equals(c));
 // however, this would return false because the Point p does not have a color
 assertFalse(c.equals(p));

Some more reading I would highly recommend is the "Pitfall #3: Defining equals in terms of mutable fields" section in this great page:

How to Write an Equality Method in Java

Some additional links:

Oh, and just for posterity, regardless of what fields you choose to compare to determine equality, you need to use the same fields in the hashCode calculation. equals and hashCode must be symmetric. If two objects are equals, they must have the same hash-code. The opposite is not necessarily true.

Gray
  • 115,027
  • 24
  • 293
  • 354
  • "If you are not storing your objects in a Collection then you don't have to define the equals method based on identity" The whole point of OO is that your class represents an independent abstraction. You would be leaking down requirements and tweaking your class interface for particular purposes which can render it meaningless. – DPM Mar 16 '12 at 15:06
  • Although academically correct @Jubbat, the Java `Collection`s impact just about every part of the JDK. You can ignore them to be a OO purist but you do so at your peril. – Gray Mar 16 '12 at 15:10
  • Still think it's better to represent invariants like unique identifiers in the class documentation and use assertions for them (see my answer below) – DPM Mar 16 '12 at 15:14
  • @Jubbat. We disagree about this _and_ asserts. Read this link for more information. http://www.artima.com/lejava/articles/equality.html – Gray Mar 16 '12 at 15:17
  • I agree with the article. Which part is in conflict with what I have written? – DPM Mar 16 '12 at 15:22
  • Ok, I see: "Defining equals in terms of mutable fields". You might have a point there. I have to think about that one... – DPM Mar 16 '12 at 15:26
  • @Jubbat Yeah. Also, take a look at the class hierarchy issues around equality and symmetry. This issue bleeds into database persistence and other areas as well. – Gray Mar 16 '12 at 15:31
  • In relation of class hierarchy, equals implementation should be avoided altogether or eliminate the hierarchy in favor of composition. It's simply not possible to have a hierarchy where you have a "significant" new variable in the child class and an equals method that binds to its contract. But that is a different matter of what I originally argued. – DPM Mar 16 '12 at 16:17
  • @Jubbat, on the last one, your point is valid for obvious abstractions like Point and Circle, but what about not so obvious cases? For example, we have `ReportConfig` basic interface, which defines `id` and several other properties. And then we have `FileBasedReportConfig` which has additional property `templateFile`. Does `equals` makes sense in this case? Surely. How to implement it correctly -- I'm not so sure... – Andrey Balaguta Mar 17 '12 at 09:13
  • @AndreyBalaguta: Conceptually, one may define equality by saying that any legitimate derivatives of a type which may compare equal to something of a different derivative of that type must have a canonical representation to which it can be converted, and two things regardless of type are equal if they would yield the same canonical representation. For example, one could have an abstract `MatrixOfDouble` type with subtypes `ArrayBackedMatrixOfDouble`, `DiagonalMatrixOfDouble`, and `ZeroMatrixOfDouble`, and specify that the canonical form for any of them is a collection of cells. – supercat Nov 12 '13 at 17:49
  • @AndreyBalaguta: Given a `DiagonalMatrixOfDouble` whose diagonal contained 500 values, and a `ZeroMatrixOfDouble` whose length and width were both 500, it would be possible to compare them by converting each to a 500x500 matrix and observing that all 250,000 values matched, but if either type knows about the other it may be able to do the comparison faster [e.g. the diagonal matrix may know that it can match the zero matrix iff all elements on the diagonal are zero]. – supercat Nov 12 '13 at 17:52
  • The first problem you are referring to has nothing to do with `Collection` or even `equals`. It's the implementation of `HashMap` (used by `HashSet` internally) that causes troubles. `HashSet` breaks the contract with `Set`. `TreeSet` does that as well, but at least that's explicitly documented. The last issue you describe is based on how `equals` is implemented in the base class, not on what fields you use. – a better oliver Mar 10 '15 at 13:16
  • I'd argue @zeroflagL that this is the same for all `Collection`s: `List`, `Queue`, and `Set` but good point about missing `Map` -- I've added it. If you say `list.contains(...)`, `equals(...)` is used and identity equals is the best pattern. In terms of the second point, it's actually the fields that the base class has chosen to use for equality. – Gray Mar 10 '15 at 14:57
  • I agree with you about "identity equals" (although it has its own pitfalls), but that's not the problem in your example. Use another implementation of `Set` and things work as expected, identity equals or not. The problem with `Point` is that it doesn't care about subtypes. `p.equals(c)` should return `false` in the first place. I particular if you follow the idea of "identity equals": A colored point is not the same as a simple point. One could argue that `Point` should either be `final` or consider subtypes in its `equals` method. – a better oliver Mar 10 '15 at 15:29
  • Any `Set` implementation that uses `equals(...)` will have the problem if you are not defining equals as "identity equals", and that means all `Set` implementations. Right? I don't quite understand your point @zeroflagL. – Gray Mar 10 '15 at 15:49
  • 1
    No. The problem with `HashSet` is that `equals` is **not** called in your example. If `equals` was called then it would return `true`, because `foo1.equals(foo1) == true`. So no second reference would be added to the set. – a better oliver Mar 10 '15 at 16:11
  • Oh I see @zeroflagL. Good point. I've changed it to make sure it is a different object instance. – Gray Mar 11 '15 at 13:36
  • Your "identity equals" link is broken. – Basil Bourque Sep 21 '18 at 20:52
  • 1
    FYI… Regarding "If Foo identity is the id field then the 2nd fooSet.add should not add another element to the Set but should replace the first one since they have the same id.", the [`Set::add`](https://docs.oracle.com/javase/10/docs/api/java/util/Set.html#add(E)) contract promises the opposite: *If this set already contains the element, the call leaves the set unchanged and returns false.* – Basil Bourque Sep 21 '18 at 22:56
  • Uh, wow. Not sure how I missed that. Fixed. Thanks @BasilBourque – Gray Sep 24 '18 at 16:12
4

Copied from Object.equals(Object obj) javadoc:

Indicates whether some other object is "equal to" this one.

The equals method implements an equivalence relation on non-null object references:

  • It is reflexive: for any non-null reference value x, x.equals(x) should return true.
  • It is symmetric: for any non-null reference values x and y, x.equals(y) should return true if and only if y.equals(x) returns true.
  • It is transitive: for any non-null reference values x, y, and z, if x.equals(y) returns true and y.equals(z) returns true, then x.equals(z) should return true.
  • It is consistent: for any non-null reference values x and y, multiple invocations of x.equals(y) consistently return true or consistently return false, provided no information used in equals comparisons on the objects is modified.
  • For any non-null reference value x, x.equals(null) should return false.

That's pretty clear to me, that is how equals should work. As for which fields to choose, you choose whichever combination of fields is required to determine whether some other object is "equal to" this one.

As for your specific case, if you, in your test, need a broader scope for equality, then you implement that in your test. You shouldn't hack your equals method just to make it fit.

pap
  • 27,064
  • 6
  • 41
  • 46
1

You should use all significant variables, ie variables whose value is not derived from the others, in equals.

This is from Effective Java:

For each “significant” field in the class, check if that field of the argument matches the corresponding field of this object.

If you want to match ids because it's a unique identifier for that class then just compare the id value, don't use equals in that case.

If you have a unique identifier, the language doesn't allow you to enforce that there is no other object with that identifier or that the rest of variables' values match. However, you can define that in the documentation of the class and you can use assertions in the equals implementation or elsewhere as it is an invariant given by your class' semantics.

DPM
  • 1,960
  • 3
  • 26
  • 49
  • 3
    "If you want to match ids because it's a unique identifier for that class then just compare the id value, don't use equals in that case." -- what about putting instances into a `Set` or `Map`? I need correct equals for that, and it is comparing them by id. What about this situation? – Andrey Balaguta Mar 17 '12 at 09:02
  • @AndreyBalaguta: If the id is really supposed to be unique, it shouldn't be necessary to override `Equals`. Instead, use some sort of `Map` to hold a reference to the only `YourType` that will exist for any given id. – supercat Nov 12 '13 at 17:54
1

I would not think about the unit test when writing a equals() both are different.

You define the equality of each object with one or group of properties by implementing equals() and hashcode().

In your test if you want to compare all the properties of the object, then obviously you need to call each method.

I think it is better to treat them separately.

ManuPK
  • 11,623
  • 10
  • 57
  • 76
0

I think the only best practice when overriding the equals() method is common sense.

There are no rules, apart from the equivalence definition of the Java API. Once you've chosen that definition of the equality, you have to apply it to your hashCode() method as well.

By that I mean, as a developer, you, your co-workers and maintainers should know when your instance of let's say a Watermelon equals another Object instance.

Nicolas C.
  • 983
  • 1
  • 8
  • 12