In Python 2.7.6, suppose that I have a class that defines __eq__ and a subclass thereof:

>>> class A(object):
...     def __eq__(self,other):
...         print self.__class__,other.__class__
...         return True
... 
>>> class B(A):
...     pass
... 

Now I create an object of each class, and want to compare them:

>>> a = A()
>>> b = B()
>>> a==b

The result I get:

<class '__main__.B'> <class '__main__.A'>

This shows that the interpreter calls b.__eq__(a) rather than a.__eq__(b), which is what I expected.

The documentation states (emphasis added):

  • For objects x and y, first x.__op__(y) is tried. If this is not implemented or returns NotImplemented, y.__rop__(x) is tried. If this is also not implemented or returns NotImplemented, a TypeError exception is raised. But see the following exception:

  • Exception to the previous item: if the left operand is an instance of a built-in type or a new-style class, and the right operand is an instance of a proper subclass of that type or class and overrides the base’s __rop__() method, the right operand’s __rop__() method is tried before the left operand’s __op__() method.

    This is done so that a subclass can completely override binary operators. Otherwise, the left operand’s __op__() method would always accept the right operand: when an instance of a given class is expected, an instance of a subclass of that class is always acceptable.

Since the subclass B does not override the __eq__ operator, shouldn't a.__eq__(b) be called instead of b.__eq__(a)? Is this expected behavior, or a bug? It is contrary to the documentation as I read it: am I misreading the documentation or missing something else?

Some related questions:

  • This answer quotes the documentation that I quoted above. In that case the final question involved a comparison between an object of a built-in type (1) and an instance of a new-style class. Here, I'm specifically comparing an instance of a parent class with an instance of a subclass which does not override the rop() method of its parent (in this case, __eq__ is both op() and rop()).

    In this case, Python actually does call b.__eq__(a) instead of a.__eq__(b) first, even though class B does not explicitly override A.

stochastic
  • I don't have time to dig into this myself at the moment, but I believe [this](http://hg.python.org/cpython/file/d08d1569aa04/Objects/object.c#l633) is the relevant C-code (as of Python 3.5). – dano Jul 23 '14 at 20:28
  • I agree that this looks like a bug - at the very least, a documentation bug. You'd probably need a good use-case to convince the Python folks to change the behaviour rather than the docs, though. In case you're interested, the logic is in `do_richcompare` in `Objects/object.c`. [Here](https://github.com/python/cpython/blob/master/Objects/object.c#L635) is the current code. – Mark Dickinson Jul 23 '14 at 20:28
  • Darn: dano beat me to it by a few seconds! Anyway, the question is definitely not a duplicate of the one linked to. – Mark Dickinson Jul 23 '14 at 20:30
  • @stochastic: do you want to open a bug report? – Mark Dickinson Jul 23 '14 at 20:47
  • Opened http://bugs.python.org/issue22052. It's not 100% clear to me that this *is* a doc bug, since the portion of the docs you quote comes from the 'emulating numeric types' section, which doesn't cover the comparison operators. Nevertheless, it wouldn't harm to make the docs clearer on this point. Nice catch, by the way: after almost 20 years of using Python (and a few of those years helping develop it) I thought I knew the core language pretty well, but it seems it still has the power to surprise! – Mark Dickinson Jul 23 '14 at 21:06
  • @Mark Dickinson: Thanks. I've just updated my post: the behavior is as documented if run from a file, but not from an interactive session. That would be a bug, and perhaps a separate one? – stochastic Jul 23 '14 at 21:31
  • @Mark Dickinson: I think you are mistaken about the docs. What I quoted comes entirely from Section "3.4.9 Coercion Rules", not the preceding section "3.4.8 Emulating Numeric Types". This behavior is clearly a documentation bug (or perhaps a behavior bug, since it is different interactively vs from a file) – stochastic Jul 23 '14 at 21:33
  • @stochastic: Hmm, yes. I was looking at the 3.x docs. I didn't realise it was in a different section in the 2.x docs. Thanks! – Mark Dickinson Jul 23 '14 at 21:42
  • Wait, what? You really get different behaviour from a file? That would be peculiar indeed. Are you sure you have the exact same code (in particular, `B` inheriting from `A`) in the file version? – Mark Dickinson Jul 23 '14 at 21:50
  • I went to verify, and found that I had a crucial typo in the file. Just kidding on the different behavior from a file :-). Sorry. – stochastic Jul 23 '14 at 21:54

2 Answers

It appears that a subclass is considered to "override" the superclass behavior even if all it does is inherit the superclass behavior. This is difficult to see in the __eq__ case because __eq__ is its own reflection, but you can see it more clearly if you use different operators, such as __lt__ and __gt__, which are each other's reflections:

class A(object):
    def __gt__(self,other):
        print "GT", self.__class__, other.__class__

    def __lt__(self,other):
        print "LT", self.__class__, other.__class__

class B(A):
    pass

Then:

>>> A() > B()
LT <class '__main__.B'> <class '__main__.A'>

Note that A.__gt__ was not called; instead, B.__lt__ was called.

The Python 3 documentation is illustrative, in that it states the rule in different words that are technically more accurate (emphasis added):

If the right operand’s type is a subclass of the left operand’s type and that subclass provides the reflected method for the operation, this method will be called before the left operand’s non-reflected method. This behavior allows subclasses to override their ancestors’ operations.

The subclass does indeed "provide" the reflected method; it just provides it via inheritance. If you actually remove the reflected method behavior in the subclass (by returning NotImplemented), the superclass method is correctly called (after the subclass one):

class A(object):
    def __gt__(self,other):
        print "GT", self.__class__, other.__class__

    def __lt__(self,other):
        print "LT", self.__class__, other.__class__

class B(A):
    def __lt__(self, other):
        print "LT", self.__class__, other.__class__
        return NotImplemented

>>> A() > B()
LT <class '__main__.B'> <class '__main__.A'>
GT <class '__main__.A'> <class '__main__.B'>

So basically this appears to be a documentation bug. It should say that the subclass reflected method is always tried first (for comparison operators), regardless of whether the subclass explicitly overrides the superclass implementation. (As noted by Mark Dickinson in a comment, though, it only works this way for comparison operators, not for the mathematical operator pairs like __add__/__radd__.)
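The difference between the two operator families can be checked directly. The following is a minimal sketch written for Python 3 (so `print` statements are replaced by a call log); the class `C`, the `calls` list, and the `run` helper are illustrative names, not part of the original example. It records which method is reached first for a comparison and for an addition, with and without an explicit override of the reflected method:

```python
# Python 3 sketch; A, B, C, `calls`, and `run` are illustrative names only.
calls = []

class A:
    def __gt__(self, other):
        calls.append(("__gt__", type(self).__name__))
        return True

    def __lt__(self, other):
        calls.append(("__lt__", type(self).__name__))
        return True

    def __add__(self, other):
        calls.append(("__add__", type(self).__name__))
        return "A.__add__"

    def __radd__(self, other):
        calls.append(("__radd__", type(self).__name__))
        return "A.__radd__"

class B(A):
    pass  # inherits every method, overrides none

class C(A):
    def __radd__(self, other):  # explicitly overrides the reflected method
        calls.append(("__radd__", type(self).__name__))
        return "C.__radd__"

def run(expression):
    """Evaluate a zero-argument callable and return the calls it triggered."""
    del calls[:]
    expression()
    return list(calls)

comparison_calls = run(lambda: A() > B())
inherited_add_calls = run(lambda: A() + B())
overridden_add_calls = run(lambda: A() + C())

print(comparison_calls)      # [('__lt__', 'B')]   reflected method tried first
print(inherited_add_calls)   # [('__add__', 'A')]  left operand tried first
print(overridden_add_calls)  # [('__radd__', 'C')] override gets priority
```

Only the arithmetic case distinguishes an inherited reflected method from an overridden one; the comparison case goes to the subclass instance either way.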

In practice, this is unlikely to matter, since the only time you notice it is when the subclass doesn't override the superclass. But in that case, the subclass behavior is by definition the same as the superclass's anyway, so it doesn't really matter which one is called (unless you're doing something dangerous like mutating the object from within the comparison method, in which case you should have been on your guard anyway).
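One situation where the order is observable without any mutation is when the comparison method records its operands rather than returning a boolean. The sketch below is a hypothetical illustration in Python 3; `Expr` and `Var` are made-up names, and the string formatting stands in for whatever a real expression builder would store:

```python
# Python 3 sketch; Expr and Var are hypothetical classes for illustration.
class Expr:
    def __init__(self, text):
        self.text = text

    def __lt__(self, other):
        return Expr("(%s < %s)" % (self.text, other.text))

    def __gt__(self, other):
        return Expr("(%s > %s)" % (self.text, other.text))

class Var(Expr):
    pass  # inherits both comparison methods unchanged

same_type = (Expr("a") > Expr("b")).text   # "(a > b)"
mixed_type = (Expr("a") > Var("b")).text   # "(b < a)"

print(same_type)
print(mixed_type)
```

The two results are logically equivalent, but the mixed-type comparison is recorded in its reflected form because the subclass instance on the right is asked first.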

BrenBarn
  • The behaviour is still different for comparison operators than for regular arithmetic operators, though: if `A` defines both `__add__` and `__radd__`, then `A() + B()` still calls `A`'s method first. So your interpretation of `provides` (via inheritance) doesn't apply in that case. – Mark Dickinson Jul 23 '14 at 20:39
  • @MarkDickinson: Good point, I added a note about that to my answer. – BrenBarn Jul 23 '14 at 20:48
  • > is not the inverse of –  Jul 23 '14 at 23:56
  • Thanks for this. The "dangerous" thing that I'm currently doing, is building up an expression tree when the user writes `a < b`, for later evaluation. So ideally I don't want the code `a < b` to generate the expression `b > a`, even though it's equivalent, because it's harder for the user to recognise it later when printed in string form. Your proposed fix makes my tests pass :-) – Steve Jessop Jul 06 '20 at 16:46

Here's the code that implements the described logic:

Python 2.7:

/* Macro to get the tp_richcompare field of a type if defined */
#define RICHCOMPARE(t) (PyType_HasFeature((t), Py_TPFLAGS_HAVE_RICHCOMPARE) \
             ? (t)->tp_richcompare : NULL)

...

static PyObject *
try_rich_compare(PyObject *v, PyObject *w, int op)
{
    richcmpfunc f;
    PyObject *res;

    if (v->ob_type != w->ob_type &&
        PyType_IsSubtype(w->ob_type, v->ob_type) &&
        (f = RICHCOMPARE(w->ob_type)) != NULL) {
        res = (*f)(w, v, _Py_SwappedOp[op]);  // We're executing this
        if (res != Py_NotImplemented)
            return res;
        Py_DECREF(res);
    }
    if ((f = RICHCOMPARE(v->ob_type)) != NULL) {
        res = (*f)(v, w, op);  // Instead of this.
        if (res != Py_NotImplemented)
            return res;
        Py_DECREF(res);
    }
    if ((f = RICHCOMPARE(w->ob_type)) != NULL) {
        return (*f)(w, v, _Py_SwappedOp[op]);
    }
    res = Py_NotImplemented;
    Py_INCREF(res);
    return res;
}

Python 3.x:

/* Perform a rich comparison, raising TypeError when the requested comparison
   operator is not supported. */
static PyObject *
do_richcompare(PyObject *v, PyObject *w, int op)
{
    richcmpfunc f;
    PyObject *res;
    int checked_reverse_op = 0; 

    if (v->ob_type != w->ob_type &&
        PyType_IsSubtype(w->ob_type, v->ob_type) &&
        (f = w->ob_type->tp_richcompare) != NULL) {
        checked_reverse_op = 1; 
        res = (*f)(w, v, _Py_SwappedOp[op]);  // We're executing this
        if (res != Py_NotImplemented)
            return res; 
        Py_DECREF(res);
    }    
    if ((f = v->ob_type->tp_richcompare) != NULL) {
        res = (*f)(v, w, op);   // Instead of this.
        if (res != Py_NotImplemented)
            return res; 
        Py_DECREF(res);
    }    
    if (!checked_reverse_op && (f = w->ob_type->tp_richcompare) != NULL) {
        res = (*f)(w, v, _Py_SwappedOp[op]);
        if (res != Py_NotImplemented)
            return res; 
        Py_DECREF(res);
    }    

The two versions are similar, except that the Python 2.7 version uses a RICHCOMPARE macro that checks PyType_HasFeature((t), Py_TPFLAGS_HAVE_RICHCOMPARE) instead of ob_type->tp_richcompare != NULL.
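The control flow of the Python 3 function can be approximated in pure Python. This is a rough model for illustration, not the interpreter's implementation: `rich_compare`, `SWAPPED`, and the trace list are assumed names, `getattr` on the type stands in for the `tp_richcompare` slot lookup, and the model simply returns `NotImplemented` once every attempt has failed, since the excerpt above stops before that point.

```python
# Approximate Python 3 model of the dispatch order shown in the C code.
# All names here (SWAPPED, rich_compare, the trace list) are illustrative.

# Maps each comparison method to its reflection, in the role of _Py_SwappedOp.
SWAPPED = {
    "__lt__": "__gt__", "__le__": "__ge__",
    "__eq__": "__eq__", "__ne__": "__ne__",
    "__gt__": "__lt__", "__ge__": "__le__",
}

def rich_compare(v, w, op):
    """Return (result, trace), where trace lists (type name, method) tried."""
    trace = []

    def attempt(obj, other, name):
        # Look the method up on the type, as the slot lookup would.
        method = getattr(type(obj), name, None)
        if method is None:
            return NotImplemented
        trace.append((type(obj).__name__, name))
        return method(obj, other)

    checked_reverse_op = False
    # Step 1: a right operand of a proper subclass is asked first.
    if type(v) is not type(w) and issubclass(type(w), type(v)):
        checked_reverse_op = True
        res = attempt(w, v, SWAPPED[op])
        if res is not NotImplemented:
            return res, trace
    # Step 2: the left operand's own method.
    res = attempt(v, w, op)
    if res is not NotImplemented:
        return res, trace
    # Step 3: the right operand's reflected method, unless already tried.
    if not checked_reverse_op:
        res = attempt(w, v, SWAPPED[op])
        if res is not NotImplemented:
            return res, trace
    return NotImplemented, trace

class A:
    def __gt__(self, other):
        return "GT"

    def __lt__(self, other):
        return "LT"

class B(A):
    pass

result, trace = rich_compare(A(), B(), "__gt__")
print(result, trace)  # LT [('B', '__lt__')]
```

Note that step 1 never asks whether the subclass defines the method itself; it only asks whether a method can be found on the subclass at all.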

In both versions, the condition of the first if block evaluates to true. The specific piece that one would perhaps expect to be false, going by the description in the docs, is this: (f = w->ob_type->tp_richcompare) != NULL (for Py3) or PyType_HasFeature((t), Py_TPFLAGS_HAVE_RICHCOMPARE) (for 2.7). However, the docs say that tp_richcompare is inherited by child classes:

richcmpfunc PyTypeObject.tp_richcompare

An optional pointer to the rich comparison function...

This field is inherited by subtypes together with tp_compare and tp_hash...

With the 2.x version, PyType_HasFeature((t), Py_TPFLAGS_HAVE_RICHCOMPARE) will also evaluate to true, because the Py_TPFLAGS_HAVE_RICHCOMPARE flag is set when the type has the tp_richcompare, tp_clear, and tp_traverse fields, and all of those are inherited from the parent.

So, even though B doesn't provide its own rich comparison method, it still returns a non-NULL value because its parent class provides it. As others have stated, this seems to be a doc bug; the child class doesn't actually need to override the __eq__ method of the parent, it just needs to provide one, even via inheritance.
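The distinction between "defines" and "provides" is visible from Python itself. The following is a small sketch for Python 3 (in Python 2, attribute access on the class returns unbound method wrappers, so the identity check would behave differently); the variable names are illustrative:

```python
# Python 3 sketch: B does not define __eq__, but lookup on B still finds one.
class A:
    def __eq__(self, other):
        return True

class B(A):
    pass

defined_on_b = "__eq__" in vars(B)      # is __eq__ in B's own namespace?
resolved_same = B.__eq__ is A.__eq__    # does lookup on B reach A's function?

print(defined_on_b)   # False
print(resolved_same)  # True
```

So a check that only asks "does the subclass have a comparison method?" succeeds for `B`, even though `B` adds nothing of its own.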

dano