One way of doing this is to walk through your directories and programmatically import the modules you need.
Assuming that the `Scraper X` folders are all in the same subdirectory `scrapers`, and that the `batch_run.py` script is in the directory containing `scrapers` (hence, at the same path level), the following script will do the trick:

    import os
    import importlib

    base_subdir = 'scrapers'
    for root, subdirs, filenames in os.walk(base_subdir):
        for subdir in subdirs:
            # Skip special folders such as __pycache__.
            if not subdir.startswith('__'):
                print(root, subdir)
                # Import scrapers.<subdir>.scraper and run its main().
                submodule = importlib.import_module('.'.join((root, subdir, 'scraper')))
                submodule.main()
        # The scrapers sit directly under base_subdir; deeper levels would put
        # path separators into root and yield invalid module names, so stop here.
        break
EDIT
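If the script is inside the `base_subdir` path itself, the code can be adapted by slightly changing how `import_module()` is called:

    import os
    import importlib

    base_subdir = '.'
    for root, subdirs, filenames in os.walk(base_subdir):
        for subdir in subdirs:
            # Skip special folders such as __pycache__.
            if not subdir.startswith('__'):
                print(root, subdir)
                # Import <subdir>.scraper and run its main().
                script = importlib.import_module('.'.join((subdir, 'scraper')), root)
                script.main()
        # Only the folders next to this script hold scrapers, so stop after the first level.
        break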
EDIT 2
Some explanations:
How is `import_module()` being used?
The `import_module()` line is what actually does the job. Roughly speaking, when it is used with only one argument, i.e.

    alias = importlib.import_module("my_module.my_submodule")

it is equivalent to:

    import my_module.my_submodule as alias

When it is instead used with two arguments and a name that starts with a dot, i.e.

    alias = importlib.import_module(".my_submodule", "my_module")

it is equivalent to:

    from my_module import my_submodule as alias

This second form is very convenient for relative imports (i.e. names starting with `.` or `..`, which are resolved against the package given as the second argument). The second argument is only consulted for such relative names; when the name is absolute, it is ignored.
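To make the two call forms concrete, here is a small self-contained sketch. It builds a throwaway package in a temporary directory (the names `my_module` and `my_submodule` are only placeholders, as is the `VALUE` attribute) and imports the same submodule both ways. Note the leading dot in the second call: that is what marks the name as relative to the package passed as the second argument.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package on disk: my_module/my_submodule.py (placeholder names).
tmp_dir = Path(tempfile.mkdtemp())
package_dir = tmp_dir / "my_module"
package_dir.mkdir()
(package_dir / "__init__.py").write_text("")
(package_dir / "my_submodule.py").write_text("VALUE = 42\n")

# Make the temporary directory importable.
sys.path.insert(0, str(tmp_dir))
importlib.invalidate_caches()

# Absolute form: the full dotted name, no second argument.
absolute = importlib.import_module("my_module.my_submodule")

# Relative form: the leading dot is resolved against the package given second.
relative = importlib.import_module(".my_submodule", "my_module")

# Both calls hand back the same module object from sys.modules.
print(absolute is relative)
print(absolute.VALUE)
```

Both calls return the module itself, so `alias.main()` (or any other attribute) can be used directly on the result.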
What is `if not subdir.startswith('__'):` doing?
When you import a module, Python compiles it to bytecode and caches the result as `.pyc` files in a `__pycache__` directory. The line above prevents `__pycache__` (actually, any directory whose name starts with `__`) from being treated, while walking through the directories, as if it contained modules to import. Other kinds of filtering may be equally valid.
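As one example of a different filter, the sketch below keeps only the direct subfolders that actually contain a `scraper.py` file, instead of excluding names by prefix. This is a swapped-in alternative, not the approach used in the scripts above; the helper name `find_scraper_packages` and the demo folder names are made up for illustration.

```python
import tempfile
from pathlib import Path


def find_scraper_packages(base_dir, module_file="scraper.py"):
    """Return the sorted names of direct subfolders of base_dir containing module_file."""
    return sorted(
        entry.name
        for entry in Path(base_dir).iterdir()
        if entry.is_dir() and (entry / module_file).is_file()
    )


# Demo on a throwaway tree (hypothetical folder names).
demo_root = Path(tempfile.mkdtemp())
for name in ("scraper_a", "scraper_b", "__pycache__", "docs"):
    (demo_root / name).mkdir()
for name in ("scraper_a", "scraper_b"):
    (demo_root / name / "scraper.py").write_text("def main():\n    pass\n")

found = find_scraper_packages(demo_root)
print(found)  # ['scraper_a', 'scraper_b']
```

The returned names can then be fed to `importlib.import_module()` in the same way as `subdir` in the loops above. Folders such as `__pycache__` or `docs` are skipped simply because they hold no `scraper.py`.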