44

I am converting the incoming string into hash code by doing the following function but some of the values are negative. I don't think hash values should be negative. Please tell me what I am doing wrong.

int combine = (srcadd + dstadd + sourceport + destinationport + protocol).hashCode();
System.out.println(combine);
Whymarrh
  • 13,139
  • 14
  • 57
  • 108
Xara
  • 8,748
  • 16
  • 52
  • 82

3 Answers3

59

I don't think hash values should be negative.

Why not? It's entirely valid to have negative hash codes. Most ways of coming up with a hash code naturally end up with negative values, and anything dealing with them should take account of this. However, I'd consider a different approach to coming up with your hash codes, e.g.

int hash = 17;
hash = hash * 31 + srcadd.hashCode();
hash = hash * 31 + dstadd.hashCode();
hash = hash * 31 + sourceport; // I'm assuming this is an int...
hash = hash * 31 + destinationport; // ditto
hash = hash * 31 + protocol.hashCode();
return hash;

It's not clear what the types of these expressions are, but I'm guessing you're ending up taking the hash code of a string... a string that you don't really need to create in the first place. While there are better approaches for getting hash codes for known domains, the above approach works well as a general-purpose hash generation technique.

Note that it would also help the readability of your code if you avoided abbreviations, and used camel casing, e.g. sourceAddress instead of srcadd.

Jon Skeet
  • 1,421,763
  • 867
  • 9,128
  • 9,194
  • 1
    Actually it was written at some forum that "hashCode is a way of computing a small (32-bit) digest numeric key from a long String ".So, i though its range is 2^32 and that is from 0 to 2^32 – Xara Feb 12 '12 at 15:37
  • 6
    @Zara: But `int` doesn't support numbers greater than 2^31 - 1... it *is* a 32-bit value, but in a signed range. – Jon Skeet Feb 12 '12 at 15:41
  • @JonSkeet Can you elaborate why you set hash = 17 and you multiple by 31? I'm a little lost with this detail. – bsheps Mar 09 '19 at 21:03
  • 1
    @bsheps: To be honest they're numbers copied from Josh Bloch. There are whole threads elsewhere about the choice of numbers for this approach. – Jon Skeet Mar 09 '19 at 23:05
44

sometimes the hashcode calculation itself goes beyond the Integer.MAX_VALUE, i.e 2147483647. what happens then is that we get a negative integer after the overflow. Negative hashcode is perfectly valid!

dantiston
  • 5,161
  • 2
  • 26
  • 30
Pravat Panda
  • 1,060
  • 2
  • 13
  • 27
25

It is perfectly legal to have negative hash codes, and if you are looking for hash values as used in hash-based collections you can use Math.abs(hash). This can also give you negative numbers when hash is bigger than 2^31, and the best way would be to use a shift mask (key.hashCode() & 0x7fffffff) % M, where M is the table size.

Sebastian
  • 368
  • 1
  • 12
Milky
  • 381
  • 3
  • 6
  • 1
    I don't understand why you wouldn't just use Math.abs(hash). It's my understanding that Math.abs() will only return negative for int.MIN_VALUE. If hash = key.hashCode() % M then the only way to end up with hash == int.MIN_VALUE is if M > int.MAX_VALUE, in which case you'd need to be using longs to index the table anyway. – jkindwall Nov 21 '15 at 05:00
  • 3
    By "bigger than 2^31", this answer really means "more than 31 binary digits", not a larger *integer* than 2^31. Why `(key.hashCode() & 0x7fffffff)`? Because it is a simple 1-step binary operation on the result of `hashCode()` that should (or could) execute faster than `Math.abs()`. – Ogre Psalm33 Jul 20 '17 at 15:23