Loading same-named classes with importlib based on directory name: What are the consequences?

Question

Of course "no consequences except the obvious" is a valid answer to my question about the consequences.

I help maintain a code base, with some proposed code structured roughly like this:

# main.sh
export PYTHONPATH=$PYTHONPATH:$(pwd)
python main.py "brazil"
python main.py "france"

# main.py
from importlib.util import find_spec, module_from_spec
import sys

class BaseProcessor(object):
    # abstract base class
    pass

def run_for(country):
    spec = find_spec(country + ".processor")
    module = module_from_spec(spec)
    spec.loader.exec_module(module)
    processor = module.CountryProcessor()
    processor.do_something()

if __name__ == "__main__":
    run_for(sys.argv[1])

# brazil/processor.py
from main import BaseProcessor
class CountryProcessor(BaseProcessor):
    def do_something(self):
        print("This is the Brazil CountryProcessor")

# france/processor.py
from main import BaseProcessor
class CountryProcessor(BaseProcessor):
    def do_something(self):
        print("This is the France CountryProcessor")

The code in run_for() is using importlib to find a module named processor based on the passed-in string country.

Output is as designed:

> ./main.sh 
This is the Brazil CountryProcessor
This is the France CountryProcessor

We have some similar code (but using sys.path.insert()) that has already been running for months; I don't know of any situation where the same-named classes have caused problems.

What, if any, could be the unintended consequences arising from this design?

What exactly is your question? You show some code and state that it works as intended. So what is the part that is unclear? — a_guest, Feb 27 '21 at 20:12
well, for one thing you're iterating using separate processes so the risk of classes getting confused for each other is rather low. if you were running multiple countries at same time, I'd consider loading within a function and then assigning it to a namespace rather than global scope. Something like `someCountry.processor = module.CountryProcessor()` — JL Peyret, Feb 27 '21 at 22:52
a_guest: What are the consequences? Often code that runs has unexpected and possibly undesirable consequences in the long. My question is, what (if any) are those consequences? — Philip, Feb 28 '21 at 13:27
Although come to think of it, I forgot about Software Engineering Stack Exchange. This is fairly Python-specific, so I don't know if it fits better there or here. — Philip, Feb 28 '21 at 13:30

score 0 · Answer 1 · answered Feb 28 '21 at 18:02

Here I answer my own question, making a case for the following potential unintended consequences. I'm eager to hear other opinions on the real-world impact. It could just be me seeing monsters under the bed.

It's important to stipulate, this is code that we expect to be around a while, and the overall component is of course many times more complex than my toy example.

This approach produces debugging difficulties down the road.

A conventional (e.g. visual) debugger is going to display which class loaded. If all the classes are named the same, we're denying ourselves that debugging aid.
Thus, if some bug develops with the import, it will be harder to catch. We're in essence tricking a debugger.
Even a logging-based debug strategy is bug-prone. Now we have to maintain distinct logging statements in each class. Copy/pasting runs a risk of forgetting to change the debug.

Intuitively, it seems bug-prone:

There are numerous Q/As on here about how to mitigate accidental name collisions from two different libraries, generally by giving them different names at import-time. Why would we flirt with a situation (if we accidentally load two of these) that we otherwise would work to avoid?
This would cause problems if a naive editor tried to import statically within a method running after our importlib magic. I do see such static imports in our code from time to time. But I feel like there's an even clearer bug case I'm not yet able to articulate.

It just generally adds complexity hence cost to maintenance.

Cognitive load matters. "Simple is better than complex" (Tim Peters, "Zen of Python").
This import mechanism adds complexity without much compensating gain versus conventional imports and a factory. (I elided it in my toy example, but the import code is actually spread over 3-4 functions.) The supposed gain is not having to do 40 static imports if we have 40 countries, I suppose, but there are more intuitive ways around that.
This code could not support loading multiple countries in the same Python process. Supposedly right now that won't happen. But who's to say it couldn't down the road? Good software design doesn't try to predict the future, but stays flexible as long as possible.

In real life, I'd argue this approach reflects programming effort that could have been deployed to more durable solution. That opportunity cost may not matter now that the work is done. (For context, this code is proposed to replace some code that does use a factory and static imports, so the net effect IMO was to introduce an anti-pattern to a working mechanism.)

Proposed alternative

So how would I have handled this? If the number of imports is moderate (say, up to 40 or 50 ), I would just do static imports of all the child classes, with a simple factory class (or even just function). The factory could maintain a lookup table, or (better) it could infer the class name. Converting snake case to camel case is almost trivial.

If the number is larger, I still see no reason not to load all the classes. Just iterate once through the list of subdirs, convert them to camel case, and import them simply after converting to camel case for the class name.

The memory footprint of extra imports in Python is so trivial, that I see little downside to superfluous imports in the usual Python style. If for some reason we wanted to import more judiciously, we could keep the selective import in a factory class, but still distinguishing the names of different classes. Once again, it's almost trivial to construct names as long as we follow the pattern.

Loading same-named classes with importlib based on directory name: What are the consequences?

1 Answers1