8

I need a hashCode implementation in Java which ignores the order of the fields in my class Edge. It should be possible that Node first could be Node second, and second could be Node first.

Here is my method is depend on the order:

public class Edge {
    private Node first, second;

    @Override
    public int hashCode() {
        int hash = 17;
        int hashMultiplikator = 79;
        hash = hashMultiplikator * hash
                + first.hashCode();
        hash = hashMultiplikator * hash
                + second.hashCode();
        return hash;
    }
}

Is there a way to compute a hash which is for the following Edges the same but unique?

Node n1 = new Node("a");
Node n2 = new Node("b");
Edge ab = new Edge(n1,n2);
Edge ba = new Edge(n2,n1);

ab.hashCode() == ba.hashCode() should be true.

Joachim Sauer
  • 302,674
  • 57
  • 556
  • 614
veote
  • 1,400
  • 9
  • 33
  • 62
  • 1
    This feels like an [XY problem](http://meta.stackexchange.com/questions/66377/what-is-the-xy-problem). What is your background reason for trying to do this? There may be a better alternative. – Duncan Jones Jun 10 '13 at 06:48
  • @DuncanJones My guess is he wants to have a non-directional edge in some sort of hash collection. His current hash code is directional (it depends on which on is in which field) but he wants it to be directionally agnostic. – Brian Jun 10 '13 at 06:52
  • @Brian Deleted my comment. Saw it on second look. – Fildor Jun 10 '13 at 06:53

2 Answers2

10

You can use some sort of commutative operation instead of what you have now, like addition:

@Override
public int hashCode() {
    int hash = 17;
    int hashMultiplikator = 79;
    int hashSum = first.hashCode() + second.hashCode();
    hash = hashMultiplikator * hash * hashSum;
    return hash;
}

I'd recommend that you still use the multiplier since it provides some entropy to your hash code. See my answer here, which says:

Some good rules to follow for hashing are:

  • Mix up your operators. By mixing your operators, you can cause the results to vary more. Using simply x * y in this test, I had a very large number of collisions.
  • Use prime numbers for multiplication. Prime numbers have interesting binary properties that cause multiplication to be more volatile.
  • Avoid using shift operators (unless you really know what you're doing). They insert lots of zeroes or ones into the binary of the number, decreasing volatility of other operations and potentially even shrinking your possible number of outputs.
Community
  • 1
  • 1
Brian
  • 17,079
  • 6
  • 43
  • 66
  • 1
    I doubt that you solution has a great advantage over my solution because your solution has only an offset to my solution (`hashMultiplukator * hash`). – Uwe Plonus Jun 10 '13 at 07:16
  • Even now it is only slight improvement as it is only a multiplicative offset ;) – Uwe Plonus Jun 10 '13 at 07:25
  • @UwePlonus Multiplying by a prime number, however, results in less chance of a collision because `HashMap` and `HashSet` always use a power of `2` for their table size, which means that prime number multiplication results in fewer collisions. It has to do with a concept called "orthogonal prime factorization". See [this question](http://stackoverflow.com/questions/3613102/why-use-a-prime-number-in-hashcode) for more. – Brian Jun 10 '13 at 07:31
  • @veote It doesn't *have* to be unique. See [this question](http://stackoverflow.com/questions/9398034/why-does-object-hashcode-have-conflicts-in-java). – Brian Jun 10 '13 at 07:32
  • @UwePlonus [More information about the math if you're curious.](http://stackoverflow.com/a/1147232/646634) – Brian Jun 10 '13 at 07:37
  • 2
    what is the purpose of `hashMultiplikator`? and the one of `hash`?? – UmNyobe Jun 10 '13 at 07:57
  • @UmNyobe see the questions that I linked in my two replies to UwePlonus. They deal with the math. Having both isn't necessary I suppose, but you definitely need at least one of them. – Brian Jun 10 '13 at 08:01
2

To solve you problem you have to combine both hashCodes of the components.

An example could be:

@Override
public int hashCode() {
    int prime = 17;
    return prime * (first.hashCode() + second.hashCode());
}

Please check if this matches your requirements. Also a multiplikation or an XOR insted of an addition could be possible.

Uwe Plonus
  • 9,803
  • 4
  • 41
  • 48
  • 1
    If the base hashes are already well distributed, this might work well. If otherwise they evaluate to very similar numbers, you'll end up with many collisions... You'd better test the function on your data before making a decision. – Flavio Jun 10 '13 at 07:11
  • What @Flavio said. Hash codes need to be volatile to avoid collisions, which is best done using the multipliers that the OP has in the original code. – Brian Jun 10 '13 at 07:13
  • @Brian Your solution has only an offset. If you have at least two values your solution is OK but in this case we only have one value (`hashSum`) therefore the multiplication and addition in your solution don't distribute the hashes better. – Uwe Plonus Jun 10 '13 at 07:19