131

The hashCode() method of class Boolean is implemented like this:

public int hashCode() {
    return value ? 1231 : 1237;
}

Why does it use 1231 and 1237? Why not something else?

Andrii Abramov
  • 10,019
  • 9
  • 74
  • 96
user471011
  • 7,104
  • 17
  • 69
  • 97
  • 1
    These two numbers are sufficiently big prime numbers. Please read the [article on Hash Table](http://en.wikipedia.org/wiki/Hash_table) on Wikipedia for further info. – Boris Pavlović Oct 12 '10 at 07:05

2 Answers2

153

1231 and 1237 are just two (sufficiently large) arbitrary prime numbers. Any other two large prime numbers would do fine.

Why primes?
Suppose for a second that we picked composite numbers (non-primes), say 1000 and 2000. When inserting booleans into a hash table, true and false would go into bucket 1000 % N resp 2000 % N (where N is the number of buckets).

Now notice that

  • 1000 % 8 same bucket as 2000 % 8
  • 1000 % 10 same bucket as 2000 % 10
  • 1000 % 20 same bucket as 2000 % 20
  • ....

in other words, it would lead to many collisions.

This is because the factorization of 1000 (23, 53) and the factorization of 2000 (24, 53) have so many common factors. Thus prime numbers are chosen, since they are unlikely to have any common factors with the bucket size.

Why large primes. Wouldn't 2 and 3 do?
When computing hash codes for composite objects it's common to add the hash codes for the components. If too small values are used in a hash set with a large number of buckets there's a risk of ending up with an uneven distribution of objects.

Do collisions matter? Booleans just have two different values anyway?
Maps can contain booleans together with other objects. Also, as pointed out by Drunix, a common way to create hash functions of composite objects is to reuse the subcomponents hash code implementations in which case it's good to return large primes.

Related questions:

Community
  • 1
  • 1
aioobe
  • 413,195
  • 112
  • 811
  • 826
  • `why not other large prime nos used?` Here is list of all prime no? http://mindprod.com/jgloss/primes.html under 10,000, can you please explain this – jmj Oct 12 '10 at 07:14
  • 1
    I suppose these are sufficiently large. To get a gcd greater than 1, you'd need at least `2*1231 = 2462` buckets. Are collisions a problem in such a situation? – aioobe Oct 12 '10 at 07:16
  • 2
    Interesting though that they are not really "fairly large" considering what can fit into an int. I suppose they are just big enough to work well with the JDK Hashtable, but still small enough to minimize calculation costs. – Thilo Oct 12 '10 at 07:45
  • 2
    Yes, it struck me too that they're not *that* large. But do you believe there is a higher cost with larger primes? – aioobe Oct 12 '10 at 07:52
  • 3
    @Thilo you'd need a multiple of 1231*1237 = 1,522,747 buckets before they would collide, that is plenty large enough – ratchet freak Mar 27 '14 at 13:25
  • 2
    I would say that leading to collisions with bucket count is not really a problem with boolean, but more the common construction on how we get the hascode of a composite object, namely by multiplying the hashcodes of the components with some constants and adding them up. – Drunix Mar 28 '14 at 07:27
  • I know this is an old question but I don't follow why you would have collisions if your bucket size is 6 and your primes are 2 and 3. Surely 2 % 6 != 3 % 6 and your two boolean values will only ever go into two buckets. – Chris Mar 28 '14 at 15:29
  • 1
    @Chris, made a mistake. Thanks for pointing it out. Answer updated. – aioobe Mar 28 '14 at 18:57
  • @aioobe Don't you think 121 and 127 would be large enough, and better choice as they are cached ? Is there any indication in the spec of a lowest *"correct"* prime number ? – Jean-François Savard Apr 01 '15 at 21:42
  • You want hash codes to be as evenly spread across Integer.MIN_VALUE to Integer.MAX_VALUE. When aggregating hash codes (for instance by multiplying by "sub" hashcodes) 121 and 127 wouldn't "reach" that far. I realize that 1231 and 1237 doesn't reach *that* much further, but that's the best explanation I got. – aioobe Apr 02 '15 at 05:17
4

In addition to all that's said above it can also be a small easter egg from developers:

true: 1231 => 1 + 2 + 3 + 1 = 7

7 - is a lucky number in European traditions;

false: 1237 => 1 + 2 + 3 + 7 = 13

13 (aka Devil's dozen) - unlucky number.

veben
  • 19,637
  • 14
  • 60
  • 80
bvp
  • 49
  • 1