If I compare two variables using ==, does Python first compare their identities and, only if they differ, go on to compare the values?

For example, I have two strings which point to the same string object:

>>> a = 'a sequence of chars'
>>> b = a

Does this compare the values, or just the ids?

>>> b == a
True

It would make sense to compare identity first, and I guess that is the case, but I haven't yet found anything in the documentation to support this. The closest I've got is this:

x==y calls x.__eq__(y)

which doesn't tell me whether anything is done before calling x.__eq__(y).

smci
Peter Wood

2 Answers

For instances of user-defined classes, identity (is) is used as a fallback: where the default __eq__ isn't overridden, a == b is evaluated as a is b. This ensures that the comparison always has a result. Even if __eq__ returns NotImplemented, Python tries the other operand's __eq__ and, failing that, falls back to the identity comparison rather than raising an error.

This is (somewhat obliquely - good spot Sven Marnach) referred to in the data model documentation (emphasis mine):

User-defined classes have __eq__() and __hash__() methods by default; with them, all objects compare unequal (except with themselves) and x.__hash__() returns an appropriate value such that x == y implies both that x is y and hash(x) == hash(y).


You can demonstrate it as follows:

>>> class Unequal(object):
...     def __eq__(self, other):
...         return False
...


>>> ue = Unequal()
>>> ue is ue
True
>>> ue == ue
False

so __eq__ must be called before any identity check is made, but:

>>> class NoEqual(object):
...     pass
...

>>> ne = NoEqual()
>>> ne is ne
True
>>> ne == ne
True

so an identity comparison must be used where __eq__ isn't defined.


You can see this in the CPython implementation, which notes:

/* If neither object implements it, provide a sensible default
   for == and !=, but raise an exception for ordering. */

The "sensible default" implemented is a C-level equality comparison of the pointers v and w, which will return whether or not they point to the same object.
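As a rough illustration of that default, here is a minimal sketch, assuming Python 3. The class name Opaque is hypothetical and exists only for this example; its __eq__ always declines to answer by returning NotImplemented, so the interpreter has nothing left but the pointer comparison.

```python
class Opaque:
    """Hypothetical class whose __eq__ never gives an answer."""

    def __init__(self):
        self.calls = 0  # counts how often __eq__ was consulted

    def __eq__(self, other):
        self.calls += 1
        return NotImplemented  # "I don't know how to compare these"


a = Opaque()
b = Opaque()

# __eq__ is consulted first, but since it declines, == falls back to identity.
same = (a == a)       # same object on both sides
different = (a == b)  # two distinct objects
```

Here same should come out True and different False, even though __eq__ was called in both cases, which matches the idea that the identity check is a last resort and not a first step.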

jonrsharpe
  • So, if we are implementing `__eq__`, and it makes sense, should we first check `id`? – Peter Wood Oct 08 '15 at 09:45
  • @PeterWood you could do that, yes - where the comparison is complex, it would certainly be more efficient – jonrsharpe Oct 08 '15 at 09:46
  • @jonrsharpe The Cpython line explains that! – Mazdak Oct 08 '15 at 09:47
  • I checked the code for strings. Python [checks `is` first](https://hg.python.org/cpython/file/tip/Objects/unicodeobject.c#l10943). Many thanks for the direction. – Peter Wood Oct 08 '15 at 09:52
  • @PeterWood no problem - you'll probably find that the other built-in types implement that too, where appropriate. – jonrsharpe Oct 08 '15 at 09:53
  • [Python 2.7](https://hg.python.org/cpython/file/2.7/Objects/object.c#l768) also falls back to pointer comparison, nice find. – Gall Oct 08 '15 at 09:58
  • Falling back to object identity for user-defined classes isn't a CPython implementation detail. It's part of the language specification, but documented in [a slightly unexpected place](https://docs.python.org/2/reference/datamodel.html#object.__hash__). – Sven Marnach Oct 08 '15 at 10:30
  • @SvenMarnach aha! Thank you – jonrsharpe Oct 08 '15 at 10:35
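
The identity shortcut discussed in the comments above could look something like the following sketch. The Polyline class and its points attribute are made up for illustration; the shortcut is only appropriate for types where an object should always equal itself (so not for something NaN-like).

```python
class Polyline:
    """Hypothetical value type whose comparison may be expensive."""

    def __init__(self, points):
        self.points = list(points)

    def __eq__(self, other):
        # Cheap shortcut: an object is assumed to be equal to itself.
        if self is other:
            return True
        # Let Python try the other operand (and then identity) for foreign types.
        if not isinstance(other, Polyline):
            return NotImplemented
        # The potentially expensive element-by-element comparison.
        return self.points == other.points

    def __hash__(self):
        # Defining __eq__ removes the inherited __hash__, so restore one
        # that is consistent with equality.
        return hash(tuple(self.points))


p = Polyline([(0, 0), (1, 1)])
q = Polyline([(0, 0), (1, 1)])
```

With this, p == p returns immediately via the identity check, while p == q still goes through the full comparison of the point lists.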
In addition to the answer by @jonrsharpe: if the objects being compared implement __eq__, it would be wrong for Python to check for identity first.

Look at the following example:

>>> x = float('nan')
>>> x is x
True
>>> x == x
False

NaN is a specific thing that should never compare equal to itself; however, even in this case x is x should return True, because of the semantics of is.

publysher
  • Well, yeah, NaN is kind of an exception. There are many cases where Python actually _does_ assume that object identity implies equality. These cases often result in unexpected behaviour for `float('nan')`, like using [NaN as a dictionary key](http://stackoverflow.com/questions/6441857/nans-as-key-in-dictionaries) or checking `float('nan') in list_of_floats` (the latter will only yield `True` if the exact `float('nan')` object you are checking for is in the list, not any NaN). So arguing that it would be "wrong" for Python to check for identity first seems a bit off. – Sven Marnach Oct 08 '15 at 10:28
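
The container behaviour described in the comment above can be seen in a short sketch. It assumes CPython, where each call to float('nan') creates a new object; the variable names are only for illustration.

```python
nan = float('nan')
other_nan = float('nan')  # a second, distinct NaN object

values = [nan]
# Membership tests check identity before falling back to ==.
found_same = nan in values         # the very same object is in the list
found_other = other_nan in values  # different object, and NaN != NaN

lookup = {nan: 'value'}
# Dictionary lookup also accepts an identical key without calling ==.
by_same_key = lookup.get(nan)
by_other_key = lookup.get(other_nan)
```

Here found_same is True and by_same_key is 'value', while found_other is False and by_other_key is None, even though nan == nan itself is False.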