Let's say I want to create a registry of subclasses of a certain class. Now there are two approaches I can think of and while I'm aware of (some of) their differences, I'd love to learn more about the topic.

class Base:
    pass

class DerivedA(Base):
    pass

class DerivedB(Base):
    pass

__subclasses__()

If I have the situation above, I can simply get the list of subclasses of Base like this:

>>> [cmd.__name__ for cmd in Base.__subclasses__()]
['DerivedA', 'DerivedB']

Now I'm aware that if I add a third class that is not directly subclassing Base like this:

class DerivedC(DerivedA):
    pass

I will not see this one in the list:

>>> [cmd.__name__ for cmd in Base.__subclasses__()]
['DerivedA', 'DerivedB']

Also, I can't filter the subclasses, for example to ignore a particular subclass for some reason.
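As one of the comments below points out, the direct-only limitation can be worked around by walking `__subclasses__()` recursively. A minimal sketch of that idea follows; the helper name `all_subclasses` is hypothetical and not part of the standard library:

```python
class Base:
    pass

class DerivedA(Base):
    pass

class DerivedB(Base):
    pass

class DerivedC(DerivedA):
    pass

def all_subclasses(cls):
    """Collect direct and indirect subclasses of cls (hypothetical helper)."""
    found = []
    for sub in cls.__subclasses__():
        if sub not in found:
            found.append(sub)
        # Recurse so that grandchildren such as DerivedC are included too.
        for descendant in all_subclasses(sub):
            if descendant not in found:
                found.append(descendant)
    return found

names = [c.__name__ for c in all_subclasses(Base)]
```

The membership checks guard against listing a class twice when multiple inheritance makes it reachable through more than one path.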

__init_subclass__()

Since Python 3.6 there is a nice hook into the class creation process, so more advanced things can be done without writing one's own metaclass. Thus I can also do something like this...

_registry = []

class Base:

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        _registry.append(cls.__name__)

class DerivedA(Base):
    pass

class DerivedB(Base):
    pass

class DerivedC(DerivedA):
    pass

And then simply access _registry:

>>> _registry
['DerivedA', 'DerivedB', 'DerivedC']

I can also modify Base to ignore certain subclasses if I wanted:

_registry = []

class Base:

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        if cls.__name__ != 'DerivedB':
            _registry.append(cls.__name__)

class DerivedA(Base):
    pass

class DerivedB(Base):
    pass

class DerivedC(DerivedA):
    pass

>>> _registry
['DerivedA', 'DerivedC']

Why use the latter?

Now let's say that I don't want to filter the subclasses and I'm only interested in direct subclasses. The former approach seems to be simpler (subjective, I know). What are other differences and maybe what are the advantages of the latter approach?

Thanks!

geckon
  • I'm not sure if this is a question we can effectively answer here on Stack Overflow. "What's better" is pretty subjective, and you've already described most of the differences between the classes in your question, so pretty much venting our opinions is all we could do in an answer. Perhaps if you spelled out *why* you want a list of class names, it would be easier for outside people to understand what your objective is, and how these two possible solutions would get you towards it, but without that we're just guessing, or arguing over colors to paint the bike shed. – Blckknght Nov 14 '19 at 19:09
  • What do you want to compare, the two approaches to defining `_registry`, or either `_registry` approach to `__subclasses__`? Just because the newer `__init_subclass__` can be used to re-implement `__subclasses__` doesn't mean there's any benefit to *doing* so. – chepner Nov 14 '19 at 19:10
  • Maybe dupe: [How to find all the subclasses of a class given its name?](https://stackoverflow.com/q/3862310/674039) (both options mentioned here are already discussed in detail there). Besides, it sounds like you understand both approaches pretty well, so not sure what kind of answer you're looking for here. Note the `__subclasses__` can be made recursive pretty trivially [like this](http://dpaste.com/18EDJ26). – wim Nov 14 '19 at 19:11
  • You guys are right, I edited the question a bit. "You understand and described both concepts and the differences pretty well, there is no other significant difference. Just pick whatever feels better." Would be a good answer for me. I wanted to know if I'm overlooking something. Thanks! – geckon Nov 14 '19 at 19:35

1 Answer

The obvious gain of writing __init_subclass__ in a base class in this case is that you can automatically get to the subclasses that do not inherit directly from your base class, as you put it.

If you only need the classes that inherit directly from your base, then the __subclasses__ method already gives you that. The major advantage is that you don't need to write a single line of code or keep a separate registry, as __subclasses__ will do that for you.

However, unless you are writing a relatively small app, or are dealing with a feature that just needs a small fixed number of these subclasses to be looked up, relying on __subclasses__ is not enough: as soon as you need, or want, another level of classes in your hierarchy, it will stop working, and you have to resort to a true registry anyway.

Prior to having the __init_subclass__ hook, one would have to write a proper metaclass to keep this registry, populating it in the metaclass's __init__ method, or do a complicated recursive query like:

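def check_subclass(base, candidate):
    """Return True if candidate is a direct or indirect subclass of base."""
    for cls in base.__subclasses__():
        if cls is candidate:
            return True
        # Not a direct match: search this subclass's own subclasses.
        if check_subclass(cls, candidate):
            return True
    return False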
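For comparison, the metaclass-based registry mentioned above might look roughly like this. It is only a sketch of the general pattern, and the names `RegistryMeta` and `registry` are illustrative:

```python
class RegistryMeta(type):
    # Shared list of registered class names (illustrative attribute name).
    registry = []

    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        # Register only classes derived from one that already uses this
        # metaclass, so the root class itself is skipped.
        if any(isinstance(base, RegistryMeta) for base in bases):
            RegistryMeta.registry.append(name)

class Base(metaclass=RegistryMeta):
    pass

class DerivedA(Base):
    pass

class DerivedB(Base):
    pass

class DerivedC(DerivedA):
    pass
```

Here `RegistryMeta.registry` ends up holding `'DerivedA'`, `'DerivedB'` and `'DerivedC'`, the same result as the `__init_subclass__` version, at the cost of introducing a metaclass into the hierarchy.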

And, although it should go without saying, the __init_subclass__ method can do a lot more than simply keep a registry, since it can run any code. It could check against the DB layer whether the fields mapped to that subclass are up to date and warn of a needed migration, or even perform the DB migration itself. It could also initialise any resources that instances of the class will need to find ready when they are created, such as logger handlers, thread pools, DB connection pools, you name it.
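As a small illustration of running more than registry code in the hook, the sketch below validates each subclass at definition time and attaches a per-class logger. The `table_name` attribute and the class names are invented for the example and do not come from any particular framework:

```python
import logging

class Base:

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Fail at class-definition time if `table_name` (a hypothetical
        # required attribute) is missing or is not a string.
        if not isinstance(getattr(cls, 'table_name', None), str):
            raise TypeError(f"{cls.__name__} must define a 'table_name' string")
        # Prepare a resource that instances can rely on later.
        cls.logger = logging.getLogger(f"{cls.__module__}.{cls.__qualname__}")

class User(Base):
    table_name = 'users'

try:
    class Broken(Base):
        pass
except TypeError as exc:
    error_message = str(exc)
```

Because the check runs when the class statement executes, a misconfigured subclass is reported at import time rather than when the first instance is created.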

TL;DR: If you just need the direct subclasses of a class, go with __subclasses__. The catch is exactly that it only tracks the direct subclasses.

jsbueno