10

I am implementing a class Pair to use it as a key with two-values for a HashMap. I use generics to keep the types of the fields variable. I managed to write the biggest part of the code:

public class Pair<L, R>
{
    private L left;
    private R right;


    Pair(L left, R right)
    {
        this.left = left;
        this.right = right;
    }


    public L getLeft()
    {
        return left;
    }


    public R getRight()
    {
        return right;
    }


    public void setLeft(L left)
    {
        this.left = left;
    }


    public void setRight(R right)
    {
        this.right = right;
    }


    @Override
    public boolean equals(Object obj)
    {
        if (obj instanceof Pair< ? , ? >)
        {
            Pair< ? , ? > pair = (Pair< ? , ? >)obj;
            return left.equals(pair.getLeft()) && right.equals(pair.getRight());
        }
        return false;
    }

    @Override
    public String toString()
    {
        return "Pair " + Integer.toHexString(hashCode()) + ": (" + left.toString() + ", " + right.toString()
               + ")";
    }
}

My problem is to create proper hashCode method, which definitely delivers the same hashcode for equal objects, and different hashcode for different objects. Some hints?

Brian Tompsett - 汤莱恩
  • 5,753
  • 72
  • 57
  • 129
Danny Lo
  • 1,553
  • 4
  • 26
  • 48
  • Equal objects should have the same hash code, but two unequal objects do not necessarily have to have different hash codes (they should be different quite often, tho). – pholser Jun 26 '14 at 14:33

3 Answers3

22

Don’t reinvent the wheel.

Just use return Objects.hash(left, right);

Holger
  • 285,553
  • 42
  • 434
  • 765
  • This is exactly what I've been looking for! – Danny Lo Jun 26 '14 at 14:50
  • It can really help for java 1.7. But maybe, you can notice, what should we choose for Java 1.6 and lower? – Puzzle Dec 19 '15 at 23:15
  • @Puzzle: just copy [the implementation](http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/7-b147/java/util/Objects.java#Objects.hash%28java.lang.Object%5B%5D%29). It’s surprisingly simple and works with 1.6 and even 1.5… – Holger Dec 21 '15 at 09:12
1

You are already relying on the left and right equals methods so why not also rely on their hashcodes?

@Override
public int hashCode()
{
    final int prime = 31;
    int result = 1;
    result = prime * result + (left ==null? 0 : left.hashCode());
    result = prime * result + (right ==null? 0 : right.hashCode());
    return result;
}
dkatzel
  • 31,188
  • 3
  • 63
  • 67
  • 2
    That’s what you get for free when using just `return Objects.hash(left, right);` At least Oracle’s Java 7 implementation uses exactly that. However, using a JRE method allows future improvements… – Holger Jun 26 '14 at 14:39
  • @Holger Yes, thanks for pointing that out, I didn't know about that method (still on JDK 6) However, it looks like `Objects.hash()` causes an extra array creation which probably doesn't matter in this case but is something to think about. – dkatzel Jun 26 '14 at 14:40
  • Good suggestion, I thought of using left and right hashcodes too, but didn't know how to use them best way. Anyway I'll better stick to native java implementation. – Danny Lo Jun 26 '14 at 14:59
0

This should do the trick (of course hash codes are never guaranteed to be different)

@Override
public int hashCode() {
    return (left.hashCode()+"/"+right.hashCode()).hashCode();
}

If left or right can be null, you need a bit more code to handle that.

Erich Kitzmueller
  • 36,381
  • 5
  • 80
  • 102
  • @Holger: What makes you think `left.hashCode()-right.hashCode()` provides *better* results? It's not difficult to imagine cases where `left.hashCode()-right.hashCode()` is a bad idea... For example, the `Integer` class returns the value as hashCode; so if you use this generic class to create pairs of Integer, it's trivially easy (and quite likely in normal use cases) to create hash collisions. – Erich Kitzmueller Jun 26 '14 at 14:51
  • `String.hashCode` is optimized for arbitrary `String`s, preferably human readable text. It’s not optimized for `String`s using at max 12 different characters out of the 65536 possible. Just because it is difficult to *see* the flaws does not mean it’s a better solution. Your solution is just hiding what’s going on. – Holger Jun 26 '14 at 14:56
  • I don't think that String.hashCode makes *any* assumptions about the content of the String. See http://stackoverflow.com/questions/299304/why-does-javas-hashcode-in-string-use-31-as-a-multiplier – Erich Kitzmueller Jun 26 '14 at 15:10
  • It’s not the method `String.hashCode` that makes any assumptions, it’s the set of test cases that have been used to prove that the result is sufficient which makes a difference. The original inventor of the “use 31” idea (I don’t think anyone knows who was the first one) did not need to know *why* it works. Still, hashing an `int` by hashing character by character of its `String` representation *is* using an algorithm designed for text just due to the “character by character” nature, independent from the used `31` number. – Holger Jun 26 '14 at 15:52
  • String.hashCode is probably the single most important implementation of hashCode() in the whole Java world. I'm confident it works on Chinese texts just as well as it does on strings containing just '0's and '1's, because so many different programs rely on it. Therefore, the deficiency of my proposal might be the overhead necessary to build the string, but I'll be damned if it is String.hashCode(). – Erich Kitzmueller Jun 26 '14 at 23:41
  • just recall that you have two `int` values, the hash codes of `left` and `right`. They are already `int`s, distributed over the entire `int` range. Now, what you basically do is to transform these `int`s like `String.valueOf(i).hashCode()` which obviously adds tons of collisions to these values which didn’t exist before (you may test it if it’s not so obvious to you). Arithmetically there’s no difference in whether you first perform this transformation and combine the poor results or concat the `String`s first and calculate the hash code afterwards. – Holger Jun 27 '14 at 07:43
  • Sorry, but this simply isn't correct. Try it. Find some hash collissions for my hash method, and tell me how many pairs you had to search to find a single one. For example, when creating all Integer pairs for the range 0-999, my method creates 1000000 different hash values, which is the best possible outcome. Objects.hash creates 31969 different hash values, so that's an awful lot of collissions. (But Objects.hash() is faster, unsurprisingly) For 100000000 pairs (ranges 0-9999 for both left and right), my method creates 99584634 different hashcodes, while Objects.hash creates just 319969. – Erich Kitzmueller Jun 27 '14 at 08:34
  • So you have a hash method that works good for small integers, but the question was to hash two *arbitrary* objects which will have *arbitrary* hash codes spread over the *entire* `int` *range*. It seems you didn’t even understand the problem. So it’s no wonder that you don’t understand the flaws of your solution. However, I will stop here. Everything relevant has been said. – Holger Jun 27 '14 at 08:41
  • When you implement a sparse matrix, you get *exactly* the kind of key I used for my tests. Where Objects.hash() horribly fails. Just sayin'. – Erich Kitzmueller Jun 27 '14 at 08:50
  • Just counting the number of collisions make sense for *one* value only. When combining two arbitrary `int` values to one resulting `int`, it is impossible not to have collisions. The important point here is to have an *equal distribution* of the resulting values for *all possibly input values*. You know, we are talking about `2⁶⁴` possible input values and `2³²` result values for which you had to count the number of collisions individually. You can’t test that the way you did. Your hash code might still be useful for particular use cases but can’t be an answer to a generic hash code question. – Holger Jun 27 '14 at 09:53
  • I totally aggree that an equal distributon for all possible input values is a must, and, considering the birthday paradox, we have to expect a certain amount of hash collissions even for relatively smaller key sets. That said, a generic solution that delivers subpar results for some common real-world use cases can't be called ideal IMO. – Erich Kitzmueller Jun 27 '14 at 10:53