I'm having trouble managing imports in a large software repo. For the sake of clarity, let's pretend the repo looks something like this:
repo/
    __init__.py
    utils/
        __init__.py
        math.py
        readers.py
        ...
    ...
Our __init__.py files are set up so that we can do something like this:
from repo.utils import IniReader
In this example, repo/utils/__init__.py would contain:
from .readers import IniReader, DatReader
This structure has worked out well for us from a readability standpoint, but we are now facing issues when trying to deploy applications.
The issue is this... let's pretend I'm writing an app that looks like this:
from repo.utils import IniReader
if __name__ == '__main__':
    r = IniReader('blah.ini')
    print(r.fields)
Now the from repo.utils import IniReader statement will execute repo/utils/__init__.py, which in this case will import both IniReader and DatReader. Let's pretend that the module defining DatReader looks something like this:
import numpy as np
import scipy
import tensorflow
from .math import transform
class DatReader:
    ...
This adheres to PEP 8, with all the imports at the top of the file.
The problem here is that DatReader requires some heavyweight imports (numpy, scipy, and tensorflow are huge libraries). To make matters worse, the module behind from .math import transform might itself contain something like from repo.contrib import lookup, which then executes repo/contrib/__init__.py and starts a chain reaction that ends up importing our entire repository.
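To make the chain reaction concrete, here is a minimal, self-contained sketch that rebuilds a toy version of this layout in a temporary directory and lists what a single import pulls in. The names demo_repo and heavy_dep are made up for illustration (heavy_dep stands in for numpy/scipy/tensorflow), and the contents of contrib are assumed:

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical stand-in layout; "demo_repo" and "heavy_dep" are made-up names.
FILES = {
    "heavy_dep.py": "LOADED = True\n",  # stands in for numpy/scipy/tensorflow
    "demo_repo/__init__.py": "",
    "demo_repo/contrib/__init__.py": "def lookup(key):\n    return key\n",
    "demo_repo/utils/__init__.py": "from .readers import IniReader, DatReader\n",
    "demo_repo/utils/math.py": (
        "from demo_repo.contrib import lookup\n"
        "def transform(x):\n"
        "    return lookup(x)\n"
    ),
    "demo_repo/utils/readers.py": (
        "import heavy_dep\n"
        "from .math import transform\n"
        "class IniReader:\n"
        "    pass\n"
        "class DatReader:\n"
        "    pass\n"
    ),
}


def build_tree(root):
    """Write the toy package files under root."""
    for rel, source in FILES.items():
        path = root / rel
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(source)


def modules_loaded_by(statement, root):
    """Run an import statement and return the names of newly loaded modules."""
    sys.path.insert(0, str(root))
    importlib.invalidate_caches()
    before = set(sys.modules)
    try:
        exec(statement, {})
    finally:
        sys.path.remove(str(root))
    return sorted(set(sys.modules) - before)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    build_tree(root)
    loaded = modules_loaded_by("from demo_repo.utils import IniReader", root)

# The app only asked for IniReader, yet heavy_dep and contrib were loaded too.
print(loaded)
```

On the real repo, diffing sys.modules before and after the app's imports in the same way should show how far the chain actually reaches.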
This hasn't really been a problem for those of us developers who have a full development environment set up, but now that we're trying to ship applications (internally), this import hell is becoming an issue.
Is there a standard solution to this problem? We've talked about just keeping the __init__.py files empty, or not putting all the imports at the top of a file as PEP 8 recommends. Both of these solutions come with compromises, so if anyone has suggestions or references, I'd love to hear them.
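For reference, the second option would look roughly like the sketch below. It is only an illustration: the parse method and the 'blah.dat' path are hypothetical, and the stdlib statistics module is a stand-in for numpy/scipy/tensorflow so that the snippet runs anywhere.

```python
class DatReader:
    """Sketch: the heavy dependency is imported on first use, not at module import."""

    def __init__(self, path):
        self.path = path

    def parse(self, text):
        # Deferred import: "statistics" is only a stdlib stand-in for the
        # heavyweight libraries; it is loaded when parse() is first called.
        import statistics

        values = [float(v) for v in text.split()]
        return statistics.mean(values)


reader = DatReader("blah.dat")   # constructing the reader triggers no heavy import
result = reader.parse("1 2 3")   # the deferred import happens here
print(result)
```

The compromise is that the import no longer sits at the top of the file, so a missing dependency only surfaces when the method is first called.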
Thanks!