For a multitude of reasons, I find myself importing many Python modules and wanting to iterate through each of the classes they provide.

from capacity_hdd_parser import CapacityHDDParser
from capacity_ssd_parser import CapacitySSDParser
from checksum_parser import ChecksumParser
.
.
.

Each parser inherits from a base class and has a method I want to call on every parser:

parsers = [CapacityHDDParser, CapacitySSDParser, ChecksumParser]
for parser in parsers:
    parser_instance = parser()
    data_returned = parser_instance.parse(logset_path)
    # Do a bunch of post processing here.

My problem is that I have many parsers to go through, and I feel like there has to be a way to dynamically iterate through the imported classes. Having to hand-write each of these is not only a pain in the ass, it also buries the intent of my code in noise.

AlexLordThorsen
  • hack: `for Parser in base_class.__subclasses__()` – jfs Oct 03 '14 at 20:33
  • If you can't find a completely different method of accomplishing your overall goal, I'd suggest you stick with your current method. It seems tedious and error-prone, but meta programming and manipulating variables and identifiers as if they were data can be even more risky. Your current approach seems simple and readable, so you might just stick with it. This seems like an [XY Problem](http://meta.stackexchange.com/questions/66377/what-is-the-xy-problem) though. – skrrgwasme Oct 03 '14 at 20:33
  • @SLawson My ultimate problem is that the python startup time is larger than the time to run the parsers but that kind of discussion is (I believe) beyond the scope of a SO question. – AlexLordThorsen Oct 03 '14 at 20:37
  • Agreed. It sounds like you have working code, so if you don't get the kind of answer you're looking for here, consider taking it over to [Code Review](http://codereview.stackexchange.com/). – skrrgwasme Oct 03 '14 at 20:40
  • Assuming they all end with "Parser", you could just do `parsers = [v for k,v in globals().items() if k.endswith("Parser")]` -- not saying it's pretty – a p Oct 03 '14 at 21:01
  • @ap: which is exactly my answer from 22 minutes ago. Am I hell-banned or what? :-) – Paulo Scardine Oct 03 '14 at 21:03
  • @PauloScardine na I just didn't see it, it was the last answer and it was below the scroll, my bad. I'll toss you a vote ;) – a p Oct 03 '14 at 21:05
  • This is not a dupe. I think the OP does not want to know how to list modules; he just wants a DRY way to import and iterate over a bunch of classes. – Paulo Scardine Oct 03 '14 at 23:00

3 Answers

If you don't need them in the global namespace, you could use importlib.import_module.

from importlib import import_module

for module_name, class_name in (('capacity_hdd_parser', 'CapacityHDDParser'),
                                ('capacity_ssd_parser', 'CapacitySSDParser'),
                                ('checksum_parser', 'ChecksumParser')):
    data_returned = getattr(import_module(module_name), class_name)().parse(logset_path)
    # Other processing here

You might also want to consider consolidating your parser classes into a single package. It would make this approach more DRY, and would probably be more Pythonic too. One class per file is usually needlessly verbose in Python.

djv
Silas Ray
Kids, do not try this at home:

parsers = [v for (k, v) in locals().items() 
             if k.endswith('Parser')]

You may make it a little bit safer with a better test condition.
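One way to read "a better test condition" is to check what each object is rather than what it is called. The following is only a sketch under assumptions: `BaseParser` is a hypothetical stand-in for the shared base class mentioned in the question, `find_parsers` is a made-up helper name, and the two parser classes are dummies so the example runs on its own.

```python
import inspect


class BaseParser:
    """Hypothetical stand-in for the shared base class from the question."""

    def parse(self, logset_path):
        raise NotImplementedError


class ChecksumParser(BaseParser):
    def parse(self, logset_path):
        return {"parser": "checksum", "path": logset_path}


class CapacityHDDParser(BaseParser):
    def parse(self, logset_path):
        return {"parser": "capacity_hdd", "path": logset_path}


# A name that ends in "Parser" but is not a parser class at all.
# The name-based filter would pick this up; the type-based one will not.
DefaultParser = "not a class"


def find_parsers(namespace, base=BaseParser):
    """Return the subclasses of `base` found in a namespace dict, sorted by name."""
    return [
        obj
        for _, obj in sorted(namespace.items())
        if inspect.isclass(obj) and issubclass(obj, base) and obj is not base
    ]


# At module level, globals() is the namespace the imports landed in.
parsers = find_parsers(globals())
```

This still scans a namespace, so it keeps the "magic" flavour, but a stray string or function whose name happens to end in "Parser" no longer slips through, and a renamed parser class is still found as long as it keeps the base class.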

[update]

The declarative approach by Silas is the safe bet:

import importlib

PARSERS = {
    'capacity_hdd_parser': 'CapacityHDDParser',
    'capacity_ssd_parser': 'CapacitySSDParser',
    'checksum_parser': 'ChecksumParser',
    # ...
}

def load_parser(module, parser):
    # Import the module by name and return the named class from it.
    return getattr(importlib.import_module(module), parser)

parsers = [load_parser(*item) for item in PARSERS.items()]

Better yet, you can replace the PARSERS dict with a config file.
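A rough sketch of the config-file idea, assuming a JSON file that maps module names to class names. Everything here is illustrative: the JSON text stands in for the contents of a hypothetical `parsers.json`, and standard-library modules replace the real parser modules purely so the snippet runs anywhere.

```python
import importlib
import json

# Stand-in for the contents of a hypothetical parsers.json file.
# In the real project the entries would look like
# "checksum_parser": "ChecksumParser"; stdlib names are used here
# only so that the sketch is runnable without those modules.
CONFIG_TEXT = """
{
    "collections": "OrderedDict",
    "fractions": "Fraction"
}
"""


def load_class(module_name, class_name):
    """Import a module by name and return the named attribute from it."""
    return getattr(importlib.import_module(module_name), class_name)


def load_classes(config_text):
    """Parse the JSON mapping and load every (module, class) pair in order."""
    mapping = json.loads(config_text)
    return [load_class(module, cls) for module, cls in mapping.items()]


classes = load_classes(CONFIG_TEXT)
```

One limitation of keying by module name is that each module can contribute only one class; a list of `[module, class]` pairs would lift that restriction if several parsers ever share a file.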

Paulo Scardine
  • I think your warning has it backwards: This is perfectly fine to try at home. Don't do it in a corporate environment. – skrrgwasme Oct 03 '14 at 20:41
  • So this does return all of the classes I'm interested in. What is the problem with this answer? – AlexLordThorsen Oct 03 '14 at 20:45
  • The problem is if anybody else introduces a variable ending with "Parser" in the namespace, you may have unexpected results. So it is a hidden trap in a million LoC corporate application but for ad-hoc throw-away code, I tend to indulge myself. – Paulo Scardine Oct 03 '14 at 20:48
  • Or - perhaps more nefariously, since the failure is silent - if one of the parsers' names is changed to "ParserOfXXXX" or anything not matching the pattern, you just lose behaviour with no way of knowing. – a p Oct 03 '14 at 21:06
0
for Parser in get_registered_parsers():
    data = Parser().parse(logset_path)

Define get_registered_parsers() by any means necessary, including black magic: e.g., setuptools entry_points, yapsy (a plugin architecture), or ABCs (an explicit register() function).
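As one possible reading of the "explicit register() function" option, here is a minimal sketch built on a plain class decorator rather than on ABCs, entry points, or yapsy. The names `register_parser` and `_REGISTRY` are invented for illustration, and the two parser classes are dummies.

```python
# Module-level list holding every class that opted in to registration.
_REGISTRY = []


def register_parser(cls):
    """Class decorator: record `cls` in the registry and return it unchanged."""
    _REGISTRY.append(cls)
    return cls


def get_registered_parsers():
    """Return a copy so that callers cannot mutate the registry by accident."""
    return list(_REGISTRY)


@register_parser
class ChecksumParser:
    def parse(self, logset_path):
        return "checksum:" + logset_path


@register_parser
class CapacityHDDParser:
    def parse(self, logset_path):
        return "capacity_hdd:" + logset_path


logset_path = "/tmp/logs"  # placeholder path for the example
results = [Parser().parse(logset_path) for Parser in get_registered_parsers()]
```

The registration is explicit and sits next to each class definition, so nothing depends on naming conventions. The catch is that a parser's module still has to be imported somewhere for its decorator to run.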

jfs