Short answer: when searching a bucket, a dictionary lookup first does a (cheap) reference equality check (x is y), and only if that fails does it do a (more expensive) equality check (x == y).
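As a rough sketch of that order of checks (a simplified model, not CPython's actual C code): the function name keys_match and its parameters are hypothetical, and the comparison of the stored hashes before the == check is an assumption about the implementation that goes beyond the short answer.

```python
def keys_match(stored_key, stored_hash, key, key_hash):
    """Simplified model of comparing one bucket entry with the lookup key."""
    if stored_key is key:          # cheap reference check comes first
        return True
    if stored_hash != key_hash:    # assumed: differing hashes skip __eq__
        return False
    return stored_key == key       # more expensive equality check comes last

marker = object()
same_object = keys_match(marker, hash(marker), marker, hash(marker))
equal_objects = keys_match((1, 2), hash((1, 2)), (1, 2), hash((1, 2)))
different = keys_match('x', hash('x'), 'y', hash('y'))
```

Here same_object is decided by the reference check alone, while equal_objects and different only get an answer from the later checks.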
Scenario
The __hash__ function does not call __eq__ internally. When you construct bob and jim, neither method is called.
Next you associate bob with 'tomorrow'. To know in which bucket of the dictionary bob has to be stored, the hash of bob is calculated. Once that is done, bob (and the value) is stored in the correct bucket.
Next we want to look up jim. To know in which bucket jim resides, we calculate the hash. Next we start searching in the bucket. The bucket will contain bob. We first perform a reference check (jim is bob), but that fails, so we fall back on the equality check. That check succeeds, so we return the value corresponding with bob: 'tomorrow'.
The same scenario happens when we look up bob: we calculate the hash and fetch the bucket. We perform the reference check bob is bob, and that one succeeds. So we do not need a (probably more expensive) equality check. We simply return the value 'tomorrow'.
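The scenario can be traced with a small sketch. The Person class below is an assumption, since this section does not show it: it hashes and compares on ssn. The calls list is a hypothetical helper that records which special methods ran, and the constructor arguments are made up for illustration. The recorded sequence reflects CPython's behaviour.

```python
calls = []  # hypothetical log of the special methods that were invoked

class Person(object):
    def __init__(self, name, ssn, address):
        self.name = name
        self.ssn = ssn
        self.address = address

    def __hash__(self):
        calls.append('hash')
        return hash(self.ssn)

    def __eq__(self, other):
        calls.append('eq')
        return self.ssn == other.ssn

# Constructing the objects calls neither __hash__ nor __eq__.
bob = Person('bob', '1111-222-333', None)
jim = Person('jim bo', '1111-222-333', 'sf bay area')
construct_calls = list(calls)

# Storing bob only needs the hash to pick the bucket.
dmv_appointments = {}
dmv_appointments[bob] = 'tomorrow'
store_calls = list(calls)

# Looking up jim: hash, failed reference check, then the equality check.
del calls[:]
value_for_jim = dmv_appointments[jim]
jim_calls = list(calls)

# Looking up bob: hash, then the reference check succeeds, so no __eq__.
del calls[:]
value_for_bob = dmv_appointments[bob]
bob_calls = list(calls)
```

After running this, jim_calls contains both 'hash' and 'eq', while bob_calls contains only 'hash'.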
Reference checks
The fact that a reference check is done first can be proven with the following (unhealthy) code:
class Person(object):
    def __init__(self, name, ssn, address):
        self.name = name
        self.ssn = ssn
        self.address = address

    def __hash__(self):
        print('in hash')
        return hash(self.ssn)

    def __eq__(self, other):
        print('in eq')
        return False
Here we always return False for equality. So even:
>>> bob == bob
in eq
False
>>> bob is bob
True
bob is not equal to itself (this is actually not good design, since a dictionary relies on the contract that an object is equal to itself: a good equality relation is reflexive, symmetric and transitive). Nevertheless, if we associate bob with 'tomorrow', we are still able to fetch the value associated with bob:
>>> dmv_appointments = {}
>>> dmv_appointments[bob] = 'tomorrow'
in hash
>>> dmv_appointments[bob]
in hash
'tomorrow'
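The flip side can be sketched as well: with an __eq__ that always returns False, a different object with the same hash cannot be found, because the reference check fails and the equality fallback says no. The class below is a silent variant of the one above (no printing), and the constructor arguments are assumptions for illustration. The last lines show a built-in value with the same property: float('nan') is not equal to itself, yet it can be retrieved through the very same object.

```python
class Person(object):
    def __init__(self, name, ssn, address):
        self.name = name
        self.ssn = ssn
        self.address = address

    def __hash__(self):
        return hash(self.ssn)

    def __eq__(self, other):
        return False  # deliberately broken: nothing is ever equal

bob = Person('bob', '1111-222-333', None)
jim = Person('jim bo', '1111-222-333', 'sf bay area')  # same ssn, same hash

dmv_appointments = {bob: 'tomorrow'}

# bob is found through the reference check, __eq__ is never consulted.
found_bob = dmv_appointments[bob]

# jim hashes to the same bucket, but "jim is bob" fails and __eq__ is False.
try:
    dmv_appointments[jim]
    found_jim = True
except KeyError:
    found_jim = False

# NaN is not equal to itself, yet the same object works as a key.
nan = float('nan')
nan_equals_itself = (nan == nan)
nan_value = {nan: 'stored'}[nan]
```

So the reference check only rescues lookups that use the very same object as the stored key.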