11

I have a problem with the Python MRO For this code:

class F: pass 
class G: pass 
class H: pass
class E(G,H): pass
class D(E,F): pass 
class C(E,G): pass
class B(C,H): pass
class A(D,B,E): pass

print(A.__mro__)

I get this output:

(<class '__main__.A'>, <class '__main__.D'>, <class '__main__.B'>, <class '__main__.C'>, <class '__main__.E'>, <class '__main__.G'>, <class '__main__.H'>, <class '__main__.F'>, <class 'object'>)

Why do I get <class '__main__.C'> before <class '__main__.E'>?

I thought it would be:

  1. A
  2. D,B,E
  3. E,F | C,H | G,H

and so on

martineau
  • 119,623
  • 25
  • 170
  • 301
Fleksiu
  • 191
  • 1
  • 10

1 Answers1

23

In short because C depends on E as you can see in the dependency-graph (O is object):

enter image description here

Python's method resolution order (MRO) works with the constraint that if a class a is a dependency of a class b, it is placed later in the queue than b.

Now more to the theory:

In Python, the MRO works with the following linearization rule:

L[C(B1 ... Bn)] = C + merge(L[B1] ... L[Bn], B1 ... Bn); and

L[object] = object

(source)

And the merge is defined as:

take the head of the first list, i.e L[B1][0]; if this head is not in the tail of any of the other lists, then add it to the linearization of C and remove it from the lists in the merge, otherwise look at the head of the next list and take it, if it is a good head. Then repeat the operation until all the class are removed or it is impossible to find good heads. In this case, it is impossible to construct the merge, Python 2.3 will refuse to create the class C and will raise an exception.

(source)

So for your case, it is, the first step is:

L[A] = A + merge(L[D],L[B],L[E])

Let us first resolve the recursive calls:

L[D] = D + merge(L[E],L[F]);
L[B] = B + merge(L[C],L[H]); and
L[E] = E + merge(L[G],L[H]).

And more recursion (we only do H once and do not redo E):

L[F] = F + merge(L[O]);
L[C] = C + merge(L[E],L[G]);
L[G] = G + merge(L[O]); and
L[H] = H + merge(L[O]).

Since L[O] is O and merge(a) (for one object is a) we thus have already obtained the sequence for H, G and F:

L[H] = (H, O).
L[G] = (G, O).
L[F] = (F, O).

Now we can calculate L[E]:

L[E] = E + merge( (G,O) , (H,O) ).

Since O is for both in the tail, it is placed last:

L[E] = (E,G,H,O).

Now we can calculate L[C]:

L[C] = C + merge( (E,G,H,O) , (G,O) );
L[C] = (C,E) + merge( (G,H,O) , (G,O) );
L[C] = (C,E,G) + merge( (H,O) , (O) );
L[C] = (C,E,G,H) + merge( (O) , (O) );
*L[C] = (C,E,G,H,O).

And L[D]:

L[D] = D + merge( (E,G,H,O) , (F,O) );
..;
L[D] = (D,E,G,H,F,O).

Next L[B] can be fully resolved:

L[B] = B + merge( (C,E,G,H,O) , (H,O) );
..;
L[B] = (B,C,E,G,H,O).

And now we can finally resolved:

L[A] = A + merge( (D,E,G,H,F,O) , (B,C,E,G,H,O) , (E,G,H,O) );
L[A] = (A,D) + merge( (E,G,H,F,O) , (B,C,E,G,H,O) , (E,G,H,O) );
L[A] = (A,D,B) + merge( (E,G,H,F,O) , (C,E,G,H,O) , (E,G,H,O) );
L[A] = (A,D,B,C) + merge( (E,G,H,F,O) , (E,G,H,O) , (E,G,H,O) );
L[A] = (A,D,B,C,E) + merge( (G,H,F,O) , (G,H,O) , (G,H,O) );
L[A] = (A,D,B,C,E,G) + merge( (H,F,O) , (H,O) , (H,O) );
L[A] = (A,D,B,C,E,G,H) + merge( (F,O) , (O) , (O) );
L[A] = (A,D,B,C,E,G,H,F) + merge( (O) , (O) , (O) );
L[A] = (A,D,B,C,E,G,H,F,O).

Which is the expected behavior.

A not efficient merge function I've made can be used for educational purposes, it is definitely not optimized for production:

def mro_merge(*args):
    for i,arg in enumerate(args):
        if len(arg) > 0:
            head = arg[0]
            for argb in args:
                if head in argb[1:]:
                    break
            else:
                newargs = tuple(argb if len(argb) > 0 and argb[0] != head else argb[1:] for argb in args)
                print('mro_merge(%s) = %s + mro_merge(%s)'%(args,head,newargs))
                yield head
                for x in mro_merge(*newargs):
                    yield x
                break

And when you call it, it generates:

>>> list(mro_merge(('G','O'),('H','O')))
mro_merge((('G', 'O'), ('H', 'O'))) = G + mro_merge((('O',), ('H', 'O')))
mro_merge((('O',), ('H', 'O'))) = H + mro_merge((('O',), ('O',)))
mro_merge((('O',), ('O',))) = O + mro_merge(((), ()))
['G', 'H', 'O']
>>> list(mro_merge( ('D','E','G','H','F','O') , ('B','C','E','G','H','O') , ('E','G','H','O') ))
mro_merge((('D', 'E', 'G', 'H', 'F', 'O'), ('B', 'C', 'E', 'G', 'H', 'O'), ('E', 'G', 'H', 'O'))) = D + mro_merge((('E', 'G', 'H', 'F', 'O'), ('B', 'C', 'E', 'G', 'H', 'O'), ('E', 'G', 'H', 'O')))
mro_merge((('E', 'G', 'H', 'F', 'O'), ('B', 'C', 'E', 'G', 'H', 'O'), ('E', 'G', 'H', 'O'))) = B + mro_merge((('E', 'G', 'H', 'F', 'O'), ('C', 'E', 'G', 'H', 'O'), ('E', 'G', 'H', 'O')))
mro_merge((('E', 'G', 'H', 'F', 'O'), ('C', 'E', 'G', 'H', 'O'), ('E', 'G', 'H', 'O'))) = C + mro_merge((('E', 'G', 'H', 'F', 'O'), ('E', 'G', 'H', 'O'), ('E', 'G', 'H', 'O')))
mro_merge((('E', 'G', 'H', 'F', 'O'), ('E', 'G', 'H', 'O'), ('E', 'G', 'H', 'O'))) = E + mro_merge((('G', 'H', 'F', 'O'), ('G', 'H', 'O'), ('G', 'H', 'O')))
mro_merge((('G', 'H', 'F', 'O'), ('G', 'H', 'O'), ('G', 'H', 'O'))) = G + mro_merge((('H', 'F', 'O'), ('H', 'O'), ('H', 'O')))
mro_merge((('H', 'F', 'O'), ('H', 'O'), ('H', 'O'))) = H + mro_merge((('F', 'O'), ('O',), ('O',)))
mro_merge((('F', 'O'), ('O',), ('O',))) = F + mro_merge((('O',), ('O',), ('O',)))
mro_merge((('O',), ('O',), ('O',))) = O + mro_merge(((), (), ()))
['D', 'B', 'C', 'E', 'G', 'H', 'F', 'O']
Willem Van Onsem
  • 443,496
  • 30
  • 428
  • 555
  • It would be nice if you added the source of your quotation. – Georg Schölly Jan 26 '17 at 17:19
  • 1
    I think `L[D] = A + merge(L[E],L[F]);` should be `L[D] = D + merge(L[E],L[F]);` right? – Takuro Wada Jan 29 '17 at 10:20
  • 1
    @TakuroWada: yes that was a copy-paste error. Well spotted :). – Willem Van Onsem Jan 29 '17 at 10:50
  • Willem: Are you confident that your `mro_merge()` function produces the same results as what Python does internally? I ask because I'm trying to write a metaclass that does something to each of the classes in the MRO list, but needs to do so _before_ the class is created (so can't call `mro()` or use `__mro__` on it because it doesn't exist yet). – martineau Jul 30 '17 at 17:52