21

Object javadocs and Josh Bloch tell us a great deal about how hashCode/equals should be implemented, and good IDEs will handle fields of various types correctly. Some discussion about all that is here.

This question is about the next step: How do you make sure that they remain good?

In particular, I feel that for most Classes, equals/hashCode should be implemented as Bloch suggests (and Eclipse and other IDE's implement), and take in to account all non-derived, business-logic, fields on that class. While adding new fields to a class as part of continuing work, people often forget to add them to the equals/hashCode implementation. This can lead to hard-to-find bugs, when two objects appear equal, but in fact differ by the value of a recently introduced field.

How can a team (even of one!) help ensure that the equals/hashCode on a class continue to take in to account all relevant fields, as the member fields change?

I know that Apache's EqualsBuilder and HashCodeBuilder can use reflection, which obviously would take in to account the correct fields, but I want to avoid the performance costs of using them. Are there other approaches to flag up fields that are not included in equals/hashCode, and should be? Static code analysis, IDE features, unit test techniques?

Community
  • 1
  • 1
sharakan
  • 6,821
  • 1
  • 34
  • 61
  • 2
    Don't be concerned about the performance of reflection until you have a profiler telling you to be. You're probably using it all over the place without knowing it anyway. – Ryan Stewart Oct 09 '11 at 15:09
  • @RyanStewart Good point, should've been more clear. For some classes it's not a concern, but for some it is (demonstrated with profiler), and I want a way to ensure equals/hashCode are dealing with fields that doesn't use reflection with the ensuing runtime overhead. – sharakan Oct 09 '11 at 15:47

6 Answers6

8

Never tried it, but how about http://code.google.com/p/equalsverifier/?

artbristol
  • 32,010
  • 5
  • 70
  • 103
  • Looks promising, I'll try it out. – sharakan Apr 11 '12 at 18:16
  • +1 Nice. It looks like it can even suppress fields that you know are not relevant. It seems just like what I've been looking for too. – Alan Escreet Apr 12 '12 at 06:46
  • I'd like to use this, but to solve my problem I'd want to use the 'allFieldsShouldBeUsed' flag, which unfortunately also includes static, final fields. I'm sure it'll be fixed soon (http://code.google.com/p/equalsverifier/issues/detail?can=2&start=0&num=100&q=&colspec=ID%20Type%20Status%20Priority%20Milestone%20Reporter%20Summary&groupby=&sort=&id=57), but for now it looks like Lombok is my answer. – sharakan Apr 15 '12 at 19:41
7

A potential answer seems to be offered in this question.

I haven't looked into Project Lombok much, but I immediately thought, hmm annotations would work with a code generator.

Community
  • 1
  • 1
Tim Bender
  • 20,112
  • 2
  • 49
  • 58
  • +1 This tool really is great. Automatically generating boiler-plate code such as equals() and hashCode() at compile-time means that any future updates to the class will automatically have those changes incorporated to the implementations of those methods. – Alan Escreet Apr 12 '12 at 17:02
1

How about you write unittests for each class you want to protect. The unit tests should

  1. Use reflection to count the number of fields, and compare with locally stored field names (and types).
  2. If a change in number of fields (or type of fields) has been detected, then invoke hashcode. If the value is the same, then fail the test case. Alert the developer that fields of a class have been changed, and that the developer should : Update the equals and Hashcose, and update the unittest.
  3. Make sure the unittest runs in a nightly build.
rk2010
  • 3,481
  • 7
  • 27
  • 39
  • 3
    This wouldn't help. If the developer who adds a new field also doesn't edit the unit test, the test will continue to pass on the old logic, even though the class fields are now out-of-sync with the equals and hashCode methods. – Alan Escreet Apr 11 '12 at 16:01
  • I think the poster's point is that the value of hashcode should not be the same if the number of fields have changed. There seems to be an implied persistence layer... or the general concept that this unit test would have two hardcoded fields, an int indicating the number of fields in the object being tested when the test was last updated and a long indicating the value of the hashcode. But, yeah, fragile reminder at best. – Tim Bender Apr 11 '12 at 16:10
  • 2
    The unit test could use reflection to fill all fields of two objects differing on one field and then test `equals`. – Joop Eggen Apr 11 '12 at 16:11
  • @JoopEggen, that seems like a much better approach. You should propose it as an answer. The only way in which it is then fragile is that it relies on every field being integral to the determination of equality (which is not always the case, but sounds like it holds true for the OP). – Tim Bender Apr 11 '12 at 16:32
  • @TimBender : thanks, also for pointing out that my solution does not hold where equivalency is taken as equality. Seems that there are sufficient answers. – Joop Eggen Apr 11 '12 at 16:50
  • 1
    @JoopEggen How about.. UnitTest maintains a list of Fields. Each time the test is executed, it compares them with the Fields it finds at run-time using Reflection. If there is any mismatch in the set of Fields found, then it will fail the test case. The failure msg / java comment, will alert the developer that they have to update update the "Equals+Hashcode" and update this UnitTest, each time a new field is added to the class. – rk2010 Apr 11 '12 at 18:38
  • @rk2010 yes that would be feasible. Maybe also FindBugs would find incomplete `equals`. – Joop Eggen Apr 11 '12 at 21:06
0

Sounds like what you're looking for is an addon or feature in an IDE that performs the analysis of the classes and generates warnings if the equals() and hashCode() methods do not reference all the relevant fields.

This would basically move the reflection overhead to the IDE rather than let it have the cost during run-time.

Unfortunately, I don't know of any addon that does this, though I'll certainly have a look.

On the other hand, how would such an automated tool know which fields are relevant for business logic? You'd probably need to mark irrelevant fields with something like an annotation, otherwise you could find yourself with warnings you can't get rid of, but that you know are incorrect.

Edit: This answer is not useful, so I'm just compiling here the really useful suggestions from better answers:

  • Project Lombok - Eclipse addon that also works with build tools to provide automatic generation of equals(), hashCode() and other boiler-plate code - since it's generated at compile-time, any changes to the class will automatically update these methods too
  • equalsverifier - Provides a way to automatically unit test equals() with reflection - doesn't support testing hashCode()
Alan Escreet
  • 3,499
  • 2
  • 22
  • 19
  • For such a tool, I'd be fine if it assumed everything was relevant. Even better would be if I could annotate fields that should not be included. – sharakan Apr 11 '12 at 18:09
0

You could serialize your objects to a string using a tool that finds your properties using reflection (XStream, for example), and store that string in a repository. Your unit tests could re-serialize your objects and compare the results to your stored values.

As part of your own process, make the storage of those strings contingent on manually validating that your hashCode and equals correctly capture all relevant values.

phatfingers
  • 9,770
  • 3
  • 30
  • 44
  • The idea being that tests fail whenever fields get added, as a reminder that one should update hashcode/equals? – sharakan Apr 11 '12 at 18:10
  • Yes-- added, removed, or renamed. Also, I submit that not every change to a field necessarily implies an impact to equality. Your automated test can detect that your class changed, but a proper evaluation of hashcode/equals will require some human intelligence. – phatfingers Apr 11 '12 at 18:57
0

You don't need to include every field in your hash code method. If you used more than one field in your hash code algorithm then the chances are it will remain good. The IDE-generated algorithms I've seen use a random (at the time of implementation, not at execution) prime constant value, so just make sure that classes that could end up in the same map or tree together (e.g. they implement the same interface) have different constant values. Unless, of course, you want equality at the interface level.

I've never seen a hash code algorithm "go bad" - you're worrying a disproportionate amount about this.

Paul
  • 19,704
  • 14
  • 78
  • 96
  • 1
    As the question states, I'm worried about hashCode AND equals. I prefer that all fields that affect equals also affects hashCode, as then consistency between the two is guaranteed (see the best answer to the question whose link I included). And for equals(), I want all relevant business fields included. – sharakan Oct 09 '11 at 15:54
  • Are you trying to follow a standard just because you want to follow a standard or are you solving some problem? Do you often have your equals and hash code methods go out of sync? If your team is modifying the attributes of your business objects so frequently that you need an automated way to fix this, then your team has bigger problems. Your first pass at a hash code should be good enough, if you've got a grasp of the objects and processes you're trying to capture. – Paul Oct 09 '11 at 22:08
  • 3
    Solving some problem, definitely. The problem is NOT that equals/hashcode are out of synch with EACH OTHER. It's that as new business logic fields get added (due to enhancements, say) they should be added to equals(). If you forget to update equals(), than equals() (and hashCode) is out of synch with the class definition. That's the case I'm trying to catch. – sharakan Oct 10 '11 at 12:55