32

Eclipse source menu has a "generate hashCode / equals method" which generates functions like the one below.

String name; 
@Override
public int hashCode()
{
    final int prime = 31;
    int result = 1;
    result = prime * result + ((name == null) ? 0 : name.hashCode());
    return result;
}

@Override
public boolean equals(Object obj)
{
    if (this == obj)
        return true;
    if (obj == null)
        return false;
    if (getClass() != obj.getClass())
        return false;
    CompanyRole other = (CompanyRole) obj;
    if (name == null)
    {
        if (other.name != null)
            return false;
    } else if (!name.equals(other.name))
        return false;
    return true;
}

If I select multiple fields when generating hashCode() and equals() Eclipse uses the same pattern shown above.

I am not an expert on hash functions and I would like to know how "good" the generated hash function is? What are situations where it will break down and cause too many collisions?

ams
  • 60,316
  • 68
  • 200
  • 288
  • I suppose it good enough. Joshua Bloch is his book (Effective Java) cites as an example the same implementation of hashCode() and equals(). – Nestor Aug 03 '12 at 11:54
  • 3
    It's just funny to realize that none of the answers can explain why this code is good. It looks like engineers like the Joshua Bloch recipe, but, unfortunately he seems to be the only one that understood why this recipe is good. Update : even Joshua says "Even the use of a prime number is less clear, but it is traditional". So I guess, even him had to leave this question behind to fellow mathematicians. – Snicolas Dec 16 '13 at 11:29
  • 1
    Bloch also writes "while this recipe yields reasonably good hash functions, it does not yield state-of-the-art hash functions, nor do the Java platform libraries provide such hash functions as of release 1.6. Writing such hash functions is a research topic, best left to mathematicians and theoretical cumputer scientists." So if you really want to know why, you'd better go back to uni ;) – Adam Jun 26 '14 at 08:53

8 Answers8

19

You can see the implementation of hashCode function in java.util.ArrayList as

public int hashCode() {
    int hashCode = 1;
    Iterator<E> i = iterator();
    while (i.hasNext()) {
        E obj = i.next();
        hashCode = 31*hashCode + (obj==null ? 0 : obj.hashCode());
    }
    return hashCode;
}

It is one such example and your Eclipse generated code follows a similar way of implementing it. But if you feel that you have to implement your hashCode by your own, there are some good guidelines given by Joshua Bloch in his famous book Effective Java. I will post those important points from Item 9 of that book. Those are,

  1. Store some constant nonzero value, say, 17, in an int variable called result.
  2. For each significant field f in your object (each field taken into account by the equals method, that is), do the following:

    a. Compute an int hash code c for the field:

    i. If the field is a boolean, compute (f ? 1 : 0).

    ii. If the field is a byte, char, short, or int, compute (int) f.

    iii. If the field is a long, compute (int) (f ^ (f >>> 32)).

    iv. If the field is a float, compute Float.floatToIntBits(f).

    v. If the field is a double, compute Double.doubleToLongBits(f), and then hash the resulting long as in step 2.a.iii.

    vi. If the field is an object reference and this class’s equals method compares the field by recursively invoking equals, recursively invoke hashCode on the field. If a more complex comparison is required, compute a “canonical representation” for this field and invoke hashCode on the canonical representation. If the value of the field is null, return 0 (or some other constant, but 0 is traditional)

    vii. If the field is an array, treat it as if each element were a separate field. That is, compute a hash code for each significant element by applying these rules recursively, and combine these values per step 2.b. If every element in an array field is significant, you can use one of the Arrays.hashCode methods added in release 1.5.

    b. Combine the hash code c computed in step 2.a into result as follows:

       result = 31 * result + c;
    
  3. Return result.

  4. When you are finished writing the hashCode method, ask yourself whether equal instances have equal hash codes. Write unit tests to verify your intuition! If equal instances have unequal hash codes, figure out why and fix the problem.

Java language designers and Eclipse seem to follow similar guidelines I suppose. Happy coding. Cheers.

sakthisundar
  • 3,278
  • 3
  • 16
  • 29
14

Since Java 7 you can use java.util.Objects to write short and elegant methods:

class Foo {
  private String name;
  private String id;

  @Override
  public int hashCode() {
    return Objects.hash(name,id);
  }

  @Override
  public boolean equals(Object obj) {
    if (obj instanceof Foo) {
      Foo right = (Foo) obj;
      return Objects.equals(name,right.name) && Objects.equals(id,right.id);
    }
    return false;
  }
}
leftbit
  • 848
  • 1
  • 7
  • 18
5

Generally it is good, but:

  1. Guava does it somehow better, I prefer it. [EDIT: It seems that as of JDK7 Java provides a similar hash function].
  2. Some frameworks can cause problems when accessing fields directly instead of using setters/getters, like Hibernate for example. For some fields that Hibernate creates lazy, it creates a proxy not the real object. Only calling the getter will make Hibernate go for the real value in the database.
Community
  • 1
  • 1
Eugene
  • 117,005
  • 15
  • 201
  • 306
  • could you elaborate on that hibernate part ? – Sikorski Aug 03 '12 at 12:02
  • @Eugene: > Guava does it somehow better, I prefer it. I am curious - what is somehow better? – jayeff Aug 03 '12 at 12:48
  • @jayeff Not much, you can find my reasons here : http://code.google.com/p/guava-libraries/wiki/CommonObjectUtilitiesExplained – Eugene Aug 03 '12 at 13:05
  • I have guava on my classpath, I am using it for caching and immutable collections but I have not used its built in hashing algos yet. – ams Aug 03 '12 at 14:13
4

Yes, it is perfect :) You will see this approach almost everywhere in the Java source code.

Martijn Courteaux
  • 67,591
  • 47
  • 198
  • 287
  • 2
    have you tried to let it generate a hashcode for a single integer field, that reprsents an id ? I think a simple "return id" would be more suitable. – AlexWien Mar 18 '15 at 19:32
0

It's a standard way of writing hash functions. However, you can improve/simplify it if you have some knowledge about the fields. E.g. you can ommit the null check, if your class guarantees that the field never be null (applies to equals() as well). Or you can of delegate the field's hash code if only one field is used.

Heiko Schmitz
  • 300
  • 1
  • 4
0

I would also like to add a reference to Item 9, in Effective Java 2nd Edition by Joshua Bloch.

Here is a recipe from Item 9 : ALWAYS OVERRIDE HASHCODE WHEN YOU OVERRIDE EQUALS

  1. Store some constant nonzero value, say, 17, in an int variable called result.
  2. For each significant field f in your object (each field taken into account by the equals method, that is), do the following:
    a. Compute an int hash code c for the field:            
        i.   If the field is a boolean, compute (f ? 1 : 0).
        ii.  If the field is a byte, char, short, or int, compute (int) f.
        iii. If the field is a long,compute(int)(f^(f>>>32)).
        iv.  If the field is a float, compute Float.floatToIntBits(f).
        v.   If the field is a double, compute Double.doubleToLongBits(f), and then hash the resulting long as in step 2.a.iii.
        vi.  If the field is an object reference and this class’s equals method compares the field by recursively invoking equals, recursively invoke hashCode on the field. If a more complex comparison is required, compute a “canonical representation” for this field and invoke hashCode on the canonical representation. If the value of the field is null, return 0 (or some other constant, but 0 is traditional).
        vii. If the field is an array, treat it as if each element were a separate field. That is, compute a hash code for each significant element by applying these rules recursively, and combine these values per step 2.b. If every element in an array field is significant, you can use one of the Arrays.hashCode methods added in release 1.5. 
   b. Combine the hash code c computed in step 2.a into result as follows: result = 31 * result + c;
3. Return result.
4. When you are finished writing the hashCode method, ask yourself whether equal instances have equal hash codes. Write unit tests to verify your intuition! If equal instances have unequal hash codes, figure out why and fix the problem.
Ashutosh Jindal
  • 18,501
  • 4
  • 62
  • 91
0

If you are using Apache Software Foundation (commons-lang library) then below classes will help you to generate hashcode/equals/toString methods using reflection.

You don't need to worry about regenerating hashcode/equals/toString methods when you add/remove instance variables.

EqualsBuilder - This class provides methods to build a good equals method for any class. It follows rules laid out in Effective Java , by Joshua Bloch. In particular the rule for comparing doubles, floats, and arrays can be tricky. Also, making sure that equals() and hashCode() are consistent can be difficult.

HashCodeBuilder - This class enables a good hashCode method to be built for any class. It follows the rules laid out in the book Effective Java by Joshua Bloch. Writing a good hashCode method is actually quite difficult. This class aims to simplify the process.

ReflectionToStringBuilder - This class uses reflection to determine the fields to append. Because these fields are usually private, the class uses AccessibleObject.setAccessible(java.lang.reflect.AccessibleObject[], boolean) to change the visibility of the fields. This will fail under a security manager, unless the appropriate permissions are set up correctly.

Maven Dependency:

<dependency>
        <groupId>commons-lang</groupId>
        <artifactId>commons-lang</artifactId>
        <version>${commons.lang.version}</version>
</dependency>

Sample Code:

import org.apache.commons.lang.builder.EqualsBuilder;
import org.apache.commons.lang.builder.HashCodeBuilder;
import org.apache.commons.lang.builder.ReflectionToStringBuilder;

public class Test{

    instance variables...
    ....

    getter/setter methods...
    ....

    @Override
    public String toString() {
        return ReflectionToStringBuilder.toString(this);
    }

    @Override
    public int hashCode() {
        return HashCodeBuilder.reflectionHashCode(this);
    }

    @Override
    public boolean equals(Object obj) {
        return EqualsBuilder.reflectionEquals(this, obj);
    }
}
2787184
  • 3,749
  • 10
  • 47
  • 81
0

One potential drawback is that all objects with null fields will have a hash code of 31, thus there could be many potential collisions between objects that only contain null fields. This would make for slower lookups in Maps.

This can occur when you have a Map whose key type has multiple subclasses. For example, if you had a HashMap<Object, Object>, you could have many key values whose hash code was 31. Admittedly, this won't occur that often. If you like, you could randomly change the values of the prime to something besides 31, and lessen the probability of collisions.

kc2001
  • 5,008
  • 4
  • 51
  • 92