47

How can I test the hashCode() function in unit testing?

// 're' and 'im' are fields of this class (presumably the real and imaginary parts),
// and hashDouble() appears to be a helper that hashes a double to an int.
@Override
public int hashCode() {
    int result = 17 + hashDouble(re);
    result = 31 * result + hashDouble(im);
    return result;
}
– Pang, Tomasz Gutkowski

6 Answers

98

Whenever I override equals and hash code, I write unit tests that follow Joshua Bloch's recommendations in "Effective Java" Chapter 3. I make sure that equals and hash code are reflexive, symmetric, and transitive. I also make sure that "not equals" works properly for all the data members.

When I check the call to equals, I also make sure that the hashCode behaves as it should. Like this:

@Test
public void testEquals_Symmetric() {
    Person x = new Person("Foo Bar");  // equals and hashCode check name field value
    Person y = new Person("Foo Bar");
    Assert.assertTrue(x.equals(y) && y.equals(x));
    Assert.assertTrue(x.hashCode() == y.hashCode());
}
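
For completeness, the remaining contract checks from that chapter can be written the same way. Here is a rough sketch, reusing the same hypothetical Person class and JUnit 4's Assert:

@Test
public void testEquals_Reflexive() {
    Person x = new Person("Foo Bar");
    Assert.assertTrue(x.equals(x));      // an object must always equal itself
}

@Test
public void testEquals_Transitive() {
    Person x = new Person("Foo Bar");
    Person y = new Person("Foo Bar");
    Person z = new Person("Foo Bar");
    Assert.assertTrue(x.equals(y) && y.equals(z) && x.equals(z));
}

@Test
public void testEquals_NotEqual() {
    Person x = new Person("Foo Bar");
    Person y = new Person("Baz Qux");    // differs in the only data member
    Assert.assertFalse(x.equals(y));
    Assert.assertFalse(x.equals(null));  // equals(null) must return false
}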
– duffymo
  • On top of this, it would be reasonable to test that modifications to non-key fields do not change the hashCode, and that modifications to key fields do. – Ben Hardy Feb 05 '12 at 17:48
  • I think we should not test another method here; this test is just for hashCode, so only hash codes should be compared, and the equals checks should go into a separate equals test. – AZ_ Nov 23 '16 at 10:58
  • Six years later and I still disagree with you. No difference that I can see. They need to be overridden together; they can be tested together. If you require separate tests, knock yourself out. – duffymo Nov 23 '16 at 11:23
  • @duffymo why are you testing .equals with x and y? Shouldn't you be testing the symmetric property of x.hashCode() == y.hashCode() && y.hashCode() == x.hashCode()? – ennth Feb 11 '20 at 22:51
  • Nope. I already know that == is symmetric. You realize that this question and answer are almost ten years old? – duffymo Feb 12 '20 at 01:46
6

When you write a mathematical function in general (like a hash code), you check some example inputs in your tests until you are convinced that the function works as expected. How many examples that takes depends on your function.

For a hash code function I'd test at least that two distinct objects that are considered equal have the same hash code. For example:

assertNotSame(obj1, obj2); // don't cheat
assertEquals(obj1.hashCode(), obj2.hashCode());

Further, you should test that two different values have different hash codes, to avoid implementations of hashCode() like return 1;.
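
A minimal sketch of that second check, assuming a hypothetical value class MyValue whose hashCode is driven by a single field (note that Java does allow collisions, so this is only a sanity check against degenerate implementations):

@Test
public void hashCodeDiffersForDifferentValues() {
    MyValue a = new MyValue(1.0);
    MyValue b = new MyValue(2.0);
    // Not required by the hashCode contract, but catches constant-valued
    // implementations such as "return 1;".
    assertFalse(a.hashCode() == b.hashCode());
}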

– DerMike
  • Be aware that Java does not require unequal objects to have different hash codes. So in some cases the hash code may be the same for two unequal objects without violating the contract. https://docs.oracle.com/javase/8/docs/api/java/lang/Object.html#hashCode-- – dStulle Nov 28 '18 at 09:37
4

hashCode is overridden so that instances with the same field values are treated as identical by HashSet/HashMap etc. So a JUnit test should assert that two different instances with the same values return the same hashCode.
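
A rough illustration of that idea (borrowing the hypothetical Person class from the accepted answer): a HashSet lookup only succeeds if hashCode() and equals() agree.

@Test
public void equalInstancesAreFoundInHashSet() {
    Set<Person> set = new HashSet<>();
    set.add(new Person("Foo Bar"));
    // A second, distinct instance with the same field values must be found;
    // this exercises hashCode() (bucket lookup) and equals() (final comparison).
    Assert.assertTrue(set.contains(new Person("Foo Bar")));
}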

– Danubian Sailor
3

Create many (millions of) reproducibly random objects and add all the hashCodes to a Set, then check that you get almost as many unique values as the number of generated objects. To make them reproducibly random, use a fixed random seed.

Additionally, check that you can add these items to a HashSet and find them again (using a different object with the same values).

Make sure your equals() matches your hashCode() behaviour. I would also check that your fields are all final.
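
A condensed sketch of that approach, assuming a Complex(re, im) constructor like the one implied by the question, with a fixed seed and a deliberately smaller count than millions so it stays fast as a unit test:

@Test
public void hashCodeSpreadsWellForRandomInputs() {
    Random random = new Random(12345);   // fixed seed => reproducible "random" data
    Set<Integer> hashes = new HashSet<>();
    int count = 100_000;
    for (int i = 0; i < count; i++) {
        Complex c = new Complex(random.nextDouble(), random.nextDouble());
        hashes.add(c.hashCode());
    }
    // Expect nearly as many distinct hash codes as objects; the exact
    // threshold is a judgment call, not part of the hashCode contract.
    Assert.assertTrue(hashes.size() > count * 0.99);
}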

– Peter Lawrey
  • I've downvoted this for the simple reason that adding randomness to a unit test is a Bad Thing - since the first thing you want to know is *why* the test failed - which is very difficult if its inputs are random. Plus you may need to run billions of objects through your hashCode method to get confidence, and that may mean your unit test takes a very long time to run, which is also a bad thing. – pauljwilliams Dec 15 '10 at 12:22
  • No reason given for the down votes. An explanation would be helpful. – Peter Lawrey Dec 15 '10 at 12:22
  • @Visage, how does using non-random data tell you why a test fails? All you need is reproducibility to help diagnose a failed test. You cannot achieve the level of proof you claim with any realistic test-driven development. However, you can say when the test fails that you have a problem. – Peter Lawrey Dec 15 '10 at 12:25
  • @Visage, billions isn't that many, but millions would find most bugs. Just a single test can find a bug surprisingly often. – Peter Lawrey Dec 15 '10 at 12:28
  • I don't think it's a bad answer. If the randomness is used to build up a statistical profile, why is that bad? And the length of time it takes to run shouldn't be the deciding factor in whether or not to write a test. It's possible to divide your tests into fast ones that you run every time and longer-running tests that are at your discretion and don't need to be run unless changes are made. Peter's answer doesn't deserve a down vote, in my opinion. – duffymo Dec 15 '10 at 12:30
  • @Visage: in some cases a random approach makes sense, e.g. when something is not supposed to fail or even crash. I write this from personal experience. However, I agree that this is not the way to test whether or not some functionality is working and/or creating proper results. – sjngm Dec 15 '10 at 12:30
  • @Visage, I would agree that it's a bad thing to write unit tests which give a false sense of security. ;) – Peter Lawrey Dec 15 '10 at 12:30
  • @duffymo, I assume Visage imagined random to mean unreproducible; will edit. – Peter Lawrey Dec 15 '10 at 12:31
  • @PaulJWilliams - Randomness is not a problem by itself. "Since the first thing you want to know is why the test failed": your asserts (and the test framework in general) should spit out all the relevant information required to reproduce the test in case of failure. In other words, if you write your random tests correctly, then it should be trivial to collect the specific data from a failed test and paste it into a new non-random test that reproduces the failure. – Steven Byks Jan 04 '17 at 16:30
  • @PaulJWilliams "Plus you may need to run billions of objects through your hashCode method to get confidence, and that may mean your unit test takes a very long time to run, which is also a bad thing." Since the data is random, re-running the same old test will produce new results. In other words, instead of running many possibilities all at once, you could have your nightly build run just a few on a daily basis. Over time this adds up to high confidence without making a single run prohibitively long. – Steven Byks Jan 04 '17 at 16:33
1

Apart from @duffymo's test for hashCode being reflexive, symmetric, and transitive, another way of testing is via a Map, which is where hash codes actually come in handy.

@Test
public void testHashcode() {
    Person p1 = new Person("Foo Bar");
    Person p2 = new Person("Foo Bar");
    Map<Person, String> map = new HashMap<>();
    map.put(p1, "dummy");
    // The lookup with a different-but-equal instance only succeeds if
    // hashCode() and equals() are consistent with each other.
    Assert.assertEquals("dummy", map.get(p2));
}
1

I don't think there's a need to unit-test a hashCode method, especially if it is generated by either your IDE or a HashCodeBuilder (Apache Commons).
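
For reference, a typical HashCodeBuilder-based implementation (org.apache.commons.lang3.builder) looks roughly like this, using the re/im fields from the question:

@Override
public int hashCode() {
    // 17 and 31 are the usual odd seed/multiplier pair expected by HashCodeBuilder.
    return new HashCodeBuilder(17, 31)
            .append(re)
            .append(im)
            .toHashCode();
}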

– Bozho
  • Reading the code is possibly the best check to ensure it makes sense, i.e. it's hard to find pathological cases by trial and error. – Peter Lawrey Dec 15 '10 at 12:19