26

I've been reading into how super() works. I came across this recipe that demonstrates how to create an Ordered Counter:

from collections import Counter, OrderedDict

class OrderedCounter(Counter, OrderedDict):
    'Counter that remembers the order elements are first seen'
    def __repr__(self):
        return '%s(%r)' % (self.__class__.__name__,
                           OrderedDict(self))
    def __reduce__(self):
        return self.__class__, (OrderedDict(self),)

For example:

oc = OrderedCounter('adddddbracadabra')

print(oc)

OrderedCounter(OrderedDict([('a', 5), ('d', 6), ('b', 2), ('r', 2), ('c', 1)]))

Is someone able to explain how this magically works?

This also appears in the Python documentation.

Sean
  • Can you be a bit more specific about what you don't understand? – mgilson Feb 17 '16 at 01:02
  • I see that the class inherits from Counter, and OrderedDict, but I can't see how these classes are combined to produce an OrderedCounter...I'm trying to understand what makes it work - for example, is there something special about Counter? Does this help clarify what I'm asking? – Sean Feb 17 '16 at 01:05
  • Basically, as a general rule of thumb, avoid multiple inheritance (your co-workers will not kill you this way) ... (of course mixins are a little different) – Joran Beasley Feb 17 '16 at 01:05
  • 2
    The example that you posted above is missing the __init__ method that is present in the documentation you linked to, and I'm not sure, but I would expect that to be important to making it work. You can see that you override two methods. Any method not overridden will be inherited from `Counter` and, if not present there, from `OrderedDict`. If any method is on one of their superclasses it gets a little trickier. As @JoranBeasley said, "Avoid multiple inheritance": the complex inheritance tree is why you want to avoid it (it leads to unexpected results). – Matthew Feb 17 '16 at 01:09
  • 2
    Yeah, it all has to do with which `dict` methods `Counter` actually overrides/customizes. I can't figure out exactly which those would be, but I suspect `Counter` leaves `__setitem__` and `__iter__` untouched, so your new class gets those from `OrderedDict` and that's enough to give you the ordered behaviour. – Marius Feb 17 '16 at 01:13
  • @Matthew there is an `__init__` method in the python docs, but no `__init__` method in Raymond Hettinger's article. From my testing it works without the `__init__`. – Sean Feb 17 '16 at 01:34
  • oh, maybe this video may be interesting: [Raymond Hettinger - Super considered super! - PyCon 2015](https://www.youtube.com/watch?v=EiOglTERPEo) – Copperfield Feb 17 '16 at 01:41

3 Answers

40

OrderedCounter is given as an example in the OrderedDict documentation, and works without needing to override any methods:

class OrderedCounter(Counter, OrderedDict):
    pass

When a method is called on an instance, Python has to find the correct implementation to execute. It searches the class hierarchy in a defined order called the "method resolution order", or mro. The mro is stored in the class attribute __mro__:

OrderedCounter.__mro__

(<class '__main__.OrderedCounter'>, <class 'collections.Counter'>, <class 'collections.OrderedDict'>, <class 'dict'>, <class 'object'>)

When an instance of OrderedCounter calls __setitem__(), Python searches the classes in mro order: OrderedCounter, Counter, OrderedDict (where it is found). So a statement like oc['a'] = 0 ends up calling OrderedDict.__setitem__().

In contrast, __getitem__ is not overridden by any class that comes before dict in the mro, so count = oc['a'] is handled by dict.__getitem__().

oc = OrderedCounter()    
oc['a'] = 1             # this call uses OrderedDict.__setitem__
count = oc['a']         # this call uses dict.__getitem__
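
One way to check these lookups is to walk the mro by hand and report the first class whose own namespace defines each method. This is only a sketch: defining_class is a hypothetical helper, not part of collections, and the results assume CPython 3.

```python
from collections import Counter, OrderedDict

class OrderedCounter(Counter, OrderedDict):
    pass

def defining_class(cls, name):
    """Hypothetical helper: first class in cls.__mro__ that defines `name` itself."""
    for klass in cls.__mro__:
        if name in vars(klass):
            return klass
    return None

# Report where each lookup ends up for OrderedCounter.
for name in ("update", "__missing__", "__setitem__", "__getitem__"):
    print(name, "->", defining_class(OrderedCounter, name).__name__)
```

On CPython 3 this should report Counter for update and __missing__, OrderedDict for __setitem__, and dict for __getitem__. Counter.__missing__ is what makes a lookup of an unseen key return 0.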

A more interesting call sequence occurs for a statement like oc.update('foobar'). First, Counter.update() gets called. The code for Counter.update() assigns to self[elem], which gets turned into a call to OrderedDict.__setitem__(), and that in turn calls dict.__setitem__().
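
The call sequence can be observed by recording each item assignment. The sketch below uses a hypothetical subclass, TracingOrderedCounter, that logs every __setitem__ call before delegating to the next class in the mro; the calls attribute is an assumption made for illustration.

```python
from collections import Counter, OrderedDict

class TracingOrderedCounter(Counter, OrderedDict):
    """Hypothetical subclass that records every __setitem__ call."""

    def __init__(self, *args, **kwargs):
        # The log must exist before Counter.__init__ calls update().
        self.calls = []
        super().__init__(*args, **kwargs)

    def __setitem__(self, key, value):
        self.calls.append((key, value))
        # Counter defines no __setitem__, so super() reaches OrderedDict's.
        super().__setitem__(key, value)

oc = TracingOrderedCounter()
oc.update('foobar')   # resolved to Counter.update
print(oc.calls)       # one entry per counted element
print(list(oc))       # keys in first-seen order
```

Each element of 'foobar' produces one logged assignment, and the keys come back in the order they were first seen.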

If the base classes are reversed, it no longer works, because the mro is different and the wrong methods get called.

class OrderedCounter(OrderedDict, Counter):   # <<<== doesn't work
    pass
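
A quick way to see the difference is to inspect the mro of the reversed class and try to build an instance from a string. This is a sketch: ReversedCounter is a hypothetical name, and since the exact exception type and message may vary between Python versions, both TypeError and ValueError are caught.

```python
from collections import Counter, OrderedDict

class ReversedCounter(OrderedDict, Counter):   # bases swapped on purpose
    pass

# OrderedDict now comes before Counter in the lookup order.
mro_names = [klass.__name__ for klass in ReversedCounter.__mro__]
print(mro_names)

try:
    ReversedCounter('abracadabra')
    construction_error = None
except (TypeError, ValueError) as exc:
    # OrderedDict.__init__ is found first and expects key/value pairs,
    # not an iterable of elements to count.
    construction_error = exc

print(repr(construction_error))
```

With OrderedDict first, its __init__ is found before Counter.__init__, so the string is treated as a sequence of key/value pairs instead of elements to count.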

More info on mro can be found in the Python 2.3 documentation.

MSeifert
RootTwo
2

I think the __repr__ and __reduce__ methods need to be defined in the class when words are given as input.

Without repr and reduce:

from collections import Counter, OrderedDict
class OrderedCounter(Counter, OrderedDict):
    pass

oc = OrderedCounter(['apple', 'banana', 'cherry', 'mango', 'apple', 'pie', 'mango'])
print(oc)

Output:

OrderedCounter({'apple': 2, 'mango': 2, 'banana': 1, 'cherry': 1, 'pie': 1})

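The printed output above does not show the order in which the words were first seen: the inherited Counter.__repr__ lists elements from most common to least common.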
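
It can help to check the stored order separately from the printed one. The sketch below compares iteration order with most_common() for the bare class; PlainOrderedCounter and the variable names are hypothetical, chosen only for this illustration.

```python
from collections import Counter, OrderedDict

class PlainOrderedCounter(Counter, OrderedDict):
    pass

words = ['apple', 'banana', 'cherry', 'mango', 'apple', 'pie', 'mango']
oc = PlainOrderedCounter(words)

first_seen = list(oc)                              # iteration order of the keys
by_count = [word for word, _ in oc.most_common()]  # order used when sorting by count

print(first_seen)
print(by_count)
```

Iterating over the counter still yields the words in first-seen order, while most_common() puts the highest counts first, which matches the printed output.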

With repr and reduce:

from collections import Counter, OrderedDict
class OrderedCounter(Counter, OrderedDict):
    'Counter that remembers the order elements are first encountered'
    def __repr__(self):
        return '%s(%r)' % (self.__class__.__name__, OrderedDict(self))

    def __reduce__(self):
        return self.__class__, (OrderedDict(self),)
oc = OrderedCounter(['apple', 'banana', 'cherry', 'mango', 'apple', 'pie', 'mango'])
print(oc)

Output:

OrderedCounter(OrderedDict([('apple', 2), ('banana', 1), ('cherry', 1), ('mango', 2), ('pie', 1)]))
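
The __reduce__ method is what copying and pickling rely on: it returns the class together with the arguments needed to rebuild the instance. A minimal sketch of that rebuild step, calling __reduce__ directly rather than going through pickle (the names cls, args and clone are illustrative only):

```python
from collections import Counter, OrderedDict

class OrderedCounter(Counter, OrderedDict):
    'Counter that remembers the order elements are first encountered'
    def __repr__(self):
        return '%s(%r)' % (self.__class__.__name__, OrderedDict(self))

    def __reduce__(self):
        return self.__class__, (OrderedDict(self),)

oc = OrderedCounter(['apple', 'banana', 'cherry', 'mango', 'apple', 'pie', 'mango'])

# __reduce__ hands back a callable and its arguments.
cls, args = oc.__reduce__()
clone = cls(*args)   # rebuild from the ordered snapshot

print(clone)
print(list(clone) == list(oc))
```

Because the argument is an OrderedDict snapshot, the rebuilt counter keeps both the counts and the first-seen order.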
mhpd
0

I found this the easiest way to get an ordered counter in Python 3. Built-in dicts preserve insertion order as of Python 3.7, so after converting the Counter to a dict, print uses dict's __repr__, which shows the items in the order they were first seen.

from collections import Counter
c = Counter(['apple', 'banana', 'cherry', 'mango', 'apple', 'pie', 'mango'])
OC = dict(c)
print(OC)

Output:

{'apple': 2, 'banana': 1, 'cherry': 1, 'mango': 2, 'pie': 1}
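
As a rough check of what the conversion changes, the sketch below compares the key order before and after dict(c). It assumes Python 3.7 or later, where dicts (and therefore Counter) are guaranteed to keep insertion order; the variable names are illustrative.

```python
from collections import Counter

words = ['apple', 'banana', 'cherry', 'mango', 'apple', 'pie', 'mango']
c = Counter(words)
oc = dict(c)   # plain dict with the same keys and counts

print(list(c) == list(oc))   # compare key order of the Counter and the dict
print(c)                     # Counter repr
print(oc)                    # dict repr

# The plain dict no longer offers Counter-specific methods.
print(hasattr(oc, 'most_common'))
```

The key order is the same in both objects; only the printed form differs. The trade-off is that the result is a plain dict, so Counter methods such as most_common() are no longer available on it.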