2

"random".hashCode() returns a value -938285885. Are negative values expected for hashCode()?

According to the following question, there's a way the hashCode() for string is computed, but using that, won't the value keep increasing as the length of string increase and eventually be greater than Integer.MAX_VALUE?

Roman C
  • 49,761
  • 33
  • 66
  • 176
jCoder
  • 203
  • 3
  • 9
  • Why do you think it shouldn't/couldn't return a negative value? Explain your expectations. – Sotirios Delimanolis Jan 14 '15 at 16:17
  • No int can ever be greater than `Integer.MAX_VALUE`. That's what "max value" means. When the numeric result of an int calculation is greater than Integer.MAX_VALUE, an _arithmetic overflow_ occurs. (Google it). Sometimes an overflow is a Bad Thing, but when computing hash codes, it's just a normal part of the computation. – Solomon Slow Jan 14 '15 at 16:19
  • `"random".hashCode()` is the same as `(((('r'*31+'a')*31+'n')*31+'d')*31+'o')*31+'m'`. The `hashCode` for `"rando"` is positive, but the fifth multiplication by 31 causes an overflow and so the result is negative. – Paul Boddington Jan 14 '15 at 16:32

1 Answers1

10

Are negative values expected for hashCode()?

They're entirely valid, yes.

won't the value keep increasing as the length of string increase and eventually be greater than Integer.MAX_VALUE?

What makes you think that hash codes increase as the length of the string increases?

Basically, you should think of hash codes as fingerprints - collections of bits rather than numbers with a meaningful magnitude. Hash code calculations very often overflow or underflow, and that's absolutely fine. "More" or "less" are irrelevant comparisons between hash codes - all that's relevant is "equal" or "not equal", where the rules are that the hash codes for two equal values must be equal, but the hash codes for two non-equal values may still be equal. The numeric values are relevant in terms of bucketing, but that's usually an implementation detail of whatever's using them.

A hash code is just a quick way of finding values which are definitely not equal. So consider a situation where I have a set of strings with hash codes { 1, -15, 20, 5, 100 }. If I'm given a string with a hash code of 14, I know that string definitely isn't in the set. If I'm given a string with a hash code of 20, I need to check it with equals against the string in my set with a hash code of 20, to see whether or not the candidate string is in the set.

Jon Skeet
  • 1,421,763
  • 867
  • 9,128
  • 9,194