Why for all int32s, does hash(x) == x except when x = -1?

Question

Note this interesting behavior of the hash() builtin:

>>> [x for x in range(-2**31, 2**31) if x != hash(x)]
[-1]
>>> hash(-1)
-2

This is on Python 3.6.2 (it also seems the same behaviors occurs on 2.7.10). The only number in [-2**31, 2**31) that doesn't hash to itself is -1. Why is this?

I've dug into the Python source to see if I could uncover the reason, but came up with a few places which suggest that hash(-1) = -1 should hold:

https://github.com/python/cpython/blob/3.6/Python/pyhash.c#L34-L83:

/* For numeric types, the hash of a number x is based on the reduction
   of x modulo the prime P = 2**_PyHASH_BITS - 1.  It's designed so that
   hash(x) == hash(y) whenever x and y are numerically equal, even if
   x and y have different types.
   A quick summary of the hashing strategy:
   (1) First define the 'reduction of x modulo P' for any rational
   number x; this is a standard extension of the usual notion of
   reduction modulo P for integers.  If x == p/q (written in lowest
   terms), the reduction is interpreted as the reduction of p times
   the inverse of the reduction of q, all modulo P; if q is exactly
   divisible by P then define the reduction to be infinity.  So we've
   got a well-defined map
      reduce : { rational numbers } -> { 0, 1, 2, ..., P-1, infinity }.
   (2) Now for a rational number x, define hash(x) by:
      reduce(x)   if x >= 0
      -reduce(-x) if x < 0
*/

Note above that for negative numbers hash(x) = -reduce(-x).

It seems like on x86_64 the hashes are uint64s and the above invariant holds for approximately the range [-2**62, 2**62). Why is -1 special in that it (and -2, technically) are the only integers to collide in this range?

Happens in 2.7.12 as well. May be a bug, it seems to violate the contention that `reduce(n where n < 0) == -reduce(-n)`. — paxdiablo, Oct 27 '17 at 06:03
`The hash value -1 is reserved (it’s used to flag errors in the C implementation). If the hash algorithm generates this value, we simply use -2 instead.` http://effbot.org/zone/python-hash.htm — Kinght 金, Oct 27 '17 at 06:27

Why for all int32s, does hash(x) == x except when x = -1?

0 Answers0