
I am currently packaging my own module for distribution. In general everything is working fine, but fine-tuning/best-practice for structuring sub-modules is giving me some trouble.

Assuming a module structure of:

mdl
├── mdl
│   ├── __init__.py
│   ├── core.py
│   ├── sub_one
│   │   ├── __init__.py
│   │   └── core_sub_one.py
│   └── sub_two
│       ├── __init__.py
│       └── core_sub_two.py
├── README
└── setup.py

core file headers

With the header of core.py starting with:

import numpy as np

...some fairly large module code...

And the headers of both core_sub_one.py and core_sub_two.py starting with:

import numpy as np

from .. import core as cr

So all submodules require np and cr.


__init__.py structure

The mdl/__init__.py (core-level) looks like:

from . import sub_one as so
from . import sub_two as st

And __init__.py of both submodules looks like (replace one with two for the other submodule):

from . import core_sub_one
from .core_sub_one import *

I've "learnt" this structure from numpy, see for instance numpy/ma/__init__.py


Problem description

After running setup.py and importing my module with import mdl, I have some trouble with how the submodules can be accessed.
I can access my submodules with, for instance, mdl.so.some_function_in_sub_one(). This is expected and what I want.

But I can also access the top level module cr and numpy with mdl.so.cr and mdl.so.np, which I want to avoid. Is there any way to avoid this? If not: Is there any drawback of importing/connecting modules and submodules like this?

And is there any best practice for how to import libraries like numpy in sub-modules, when they are required in all submodules?

Edit:
Some seem to have trouble with the fact that asking for best practice is opinion based. I know that, and I intended it, since in my opinion most real-life design decisions are not clear binary 1-0 decisions. So I have to add:
I want to comply with the module packaging style used in the scipy, and more specifically numpy, package environment. So if these packages found a solution for any of the questions I asked, this will be the most welcome solution for me.

JE_Muc
  • Why is the question downvoted and closed? It has a clear layout and problem description. Also the questions are quite straight-forward. And **of course** a question asking for "best practice" is opinion based. That's exactly what "best practice" implies... In most cases in real life there is just no I/O decision... Since SO often receives the feedback of not being *user friendly*: Exactly this downvoting/closing behaviour is the reason for it. Just let a more or less well posed question stand for itself and let the community "live" instead of suffocating it with frustrating behaviour. – JE_Muc Feb 20 '20 at 11:28

1 Answer


First things first:

from .core_sub_one import *

DON'T DO THIS. Yes, even if you have seen it in some "big name" package, read it in some tutorial, or wherever. This is officially considered bad practice, and for good reasons (from experience, it's maintenance hell).

If you really, really insist on doing this (but seriously, don't), at least define an explicit __all__ variable in those modules so you keep the exposed names under control (it also helps document what's supposed to be part of the module's API).
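To illustrate the __all__ point, here is a minimal sketch of how a star import behaves once __all__ is defined. It builds a throwaway module in memory instead of a real package, so the module name core_sub_one_demo, the helper function, and the use of math as a stand-in for numpy are all assumptions made only to keep the example runnable with the standard library.

```python
import sys
import types

# Hypothetical stand-in for core_sub_one.py. `math` replaces numpy here so
# the sketch needs nothing beyond the standard library.
SOURCE = '''
import math as np_standin

__all__ = ["some_function_in_sub_one"]

def some_function_in_sub_one():
    return np_standin.sqrt(16.0)

def helper():
    return "not exported"
'''

# Build the module in memory and register it so it can be imported by name.
module = types.ModuleType("core_sub_one_demo")
exec(SOURCE, module.__dict__)
sys.modules["core_sub_one_demo"] = module

# Simulate what `from .core_sub_one import *` does inside a package __init__.py.
namespace = {}
exec("from core_sub_one_demo import *", namespace)

# Ignore dunder entries such as __builtins__ that exec() adds on its own.
exported = sorted(name for name in namespace if not name.startswith("__"))
print(exported)
```

With __all__ in place, only the listed function ends up in the importing namespace; the aliased library and the helper stay behind in the module itself, where they remain reachable through the module object.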

But I can also access the top level module cr and numpy with mdl.so.cr and mdl.so.np, which I want to avoid. Is there any way to avoid this?

Not really. If you're really worried about it, you can import those names as "protected" in your submodules:

# core_sub_xxx.py

import numpy as _np
from .. import core as _cr

(of course you'll have to replace all occurrences of 'np' and 'cr' but any half-decent text editor can do this)

This doesn't prevent access to mysubmodule._cr or mysubmodule._np but at least it makes it clear that one should NOT access those names.
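As a side effect, when a module defines no __all__, a star import skips every name that starts with an underscore, so the underscore aliases would also not be copied into the package's __init__.py namespace. The sketch below shows this with an in-memory module; the module name core_sub_one_private_demo and the aliases (math and json standing in for numpy and core) are hypothetical and chosen only so the example runs without third-party packages.

```python
import sys
import types

# Hypothetical stand-in for core_sub_one.py without an __all__ definition.
SOURCE = '''
import math as _np_standin    # underscore alias, like `import numpy as _np`
import json as public_alias   # ordinary alias, for comparison

def some_function_in_sub_one():
    return _np_standin.floor(2.5)
'''

# Build the module in memory and register it so it can be imported by name.
module = types.ModuleType("core_sub_one_private_demo")
exec(SOURCE, module.__dict__)
sys.modules["core_sub_one_private_demo"] = module

# Simulate `from .core_sub_one import *` in the package __init__.py.
namespace = {}
exec("from core_sub_one_private_demo import *", namespace)

# Ignore dunder entries such as __builtins__ that exec() adds on its own.
star_imported = sorted(n for n in namespace if not n.startswith("__"))
print(star_imported)

# The underscore alias still exists on the module object itself.
print(hasattr(module, "_np_standin"))
```

The ordinary alias leaks through the star import while the underscore alias does not, although the latter is still reachable directly on the module, which matches the caveat above.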

But really, this is not a big issue, as long as your API is clearly documented.

bruno desthuilliers
  • Many thanks for your help! Concerning the first point: Yeah, I **never** use starred imports, to avoid conflicts with duplicate method, class etc. names. But as you said: With numpy, it is in one of the "big(gest?) name" packages, so I thought it might be ok in *this special* case. Thanks for the clarification! Second point: For numpy it is imho reasonably acceptable to use `np`, but for `cr` I'll take your private import approach. Thanks again! I'll also take a look at the `__all__` var. – JE_Muc Feb 20 '20 at 11:17