
This is a collection of questions meant to clarify things and help me understand them better, rather than an issue I am having.

I apologise in advance if I have got things wrong or if these questions have been answered before. I wasn't able to find them.

First clarification I want to ask is:

Let us assume:

import scipy

First, I have noticed that you cannot in general access a module in a package by doing import package and then trying to access package.module.

For example scipy.io

You often have to do import package.module or even import astropy.io.fits, or you can do from package import module.

My question is: why is this the case, and why does it seem so random, depending on the package? I can't seem to identify any stable pattern. Is it because some of these libraries (packages) are very big, so to avoid memory problems import package only loads the core attributes/modules?
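The behaviour described above can be reproduced without SciPy. The following is a minimal sketch that builds a throwaway package on disk; the names lazy_pkg and tools are made up for illustration and are not part of any real library. It only shows that a submodule becomes an attribute of its parent package once something has imported it.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package: lazy_pkg/__init__.py (empty) and lazy_pkg/tools.py.
# Both names are hypothetical and exist only for this demonstration.
root = Path(tempfile.mkdtemp())
pkg_dir = root / "lazy_pkg"
pkg_dir.mkdir()
(pkg_dir / "__init__.py").write_text("")
(pkg_dir / "tools.py").write_text("ANSWER = 42\n")
sys.path.insert(0, str(root))
importlib.invalidate_caches()

import lazy_pkg

# The empty __init__.py imported nothing, so the submodule is not an attribute yet.
before = hasattr(lazy_pkg, "tools")

import lazy_pkg.tools

# Importing the submodule loads it and binds it as an attribute of the parent.
after = hasattr(lazy_pkg, "tools")

print(before, after)  # False True
```

Whether package.module works after a bare import package therefore depends on whether the package's own __init__.py (or any other code that ran earlier) has already imported that submodule.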

The second question:

It relates to actually checking the size of these packages. Is there any way to see how big they are when imported? Any way of knowing what will work and what won't other than trying it? I guess I could check with sys.modules and try to obtain it from there?
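One way to explore the sys.modules idea is to take a snapshot before and after an import. This is only a sketch: modules_loaded_by is a hypothetical helper name, the standard-library json package stands in for a larger library, and the result counts modules rather than measuring memory.

```python
import importlib
import sys


def modules_loaded_by(name):
    """Import `name` and return the module names newly added to sys.modules."""
    before = set(sys.modules)
    importlib.import_module(name)
    return sorted(set(sys.modules) - before)


# The list is empty if `json` was already imported earlier in this process.
newly_loaded = modules_loaded_by("json")
print(len(newly_loaded), "modules newly loaded")

# Submodules that json/__init__.py imports itself are now present:
print("json.decoder" in sys.modules)
```

Checking "package.module" in sys.modules after import package tells you whether package.module is usable without a further import.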

The third and final question:

In the scenario that I am not running my code on a Raspberry Pi and I don't necessarily have to worry about the memory issue (if that is the reason why they don't allow direct access), is there any way of actually importing package, such that it also loads all the sub packages?

I am just being lazy and wondering if it is possible. I am aware that it isn't good practice, but curiosity killed the cat.


Update: to make this easier to follow, here are the related questions I have looked at:

This answer gives good advice on good general practice: What are good rules of thumb for Python imports?

Why can't I use the scipy.io?: like the documentation, this explains why the subpackage isn't necessarily imported.

Then there is obviously the documentation: https://docs.python.org/3/reference/import.html#packages Section 5.2.1 explains why import scipy doesn't also import scipy.io, but I was wondering why developers would not make this automatic.

This question is actually similar to part of my question but doesn't seem to have a clear answer: Python complex subpackage importing

Status of Questions:

Question 1: Good reason in answers

Question 2: Pending

Question 3: Pending

nzicher

3 Answers


A package is represented by the file __init__.py. Therefore, the package scipy is represented by scipy/__init__.py. Inside this file you see a lot of imports like this:

from scipy.version import version as __version__

This is the reason why scipy.__version__ works, even though __version__ actually lives in scipy.version. Not all packages do this. There is no rule for when this kind of behavior can be expected. It is totally up to the package author(s).
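The re-export pattern can be tried on a small scale. This is a sketch only: reexport_pkg and its contents are invented for illustration and merely mimic the kind of line quoted above, not SciPy's actual layout.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical package: reexport_pkg/version.py holds the value,
# reexport_pkg/__init__.py re-exports it under another name.
root = Path(tempfile.mkdtemp())
pkg_dir = root / "reexport_pkg"
pkg_dir.mkdir()
(pkg_dir / "version.py").write_text("version = '1.2.3'\n")
(pkg_dir / "__init__.py").write_text(
    "from .version import version as __version__\n"
)
sys.path.insert(0, str(root))
importlib.invalidate_caches()

import reexport_pkg

# Available on the package because __init__.py put it there.
print(reexport_pkg.__version__)  # 1.2.3

# As a side effect of that line in __init__.py, the submodule was loaded too.
print("reexport_pkg.version" in sys.modules)  # True
```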

Mike Müller
  • OK, so essentially it really comes down to individual packages. Now according to https://docs.python.org/3/reference/import.html#packages section 5.2.1: "Importing parent.one will implicitly execute parent/__init__.py and parent/one/__init__.py." From this, I would expect that if I import astropy.io.fits, it would also import astropy, but I am not certain that this is true. (For example, I import matplotlib.pyplot but I still can't use matplotlib.use(), not that it would do anything.) Also, I am still baffled as to why people do this and don't automatically import the sub-packages as well. – nzicher Mar 08 '18 at 14:01

The key difference between these import calls is the namespace the module is imported into. Given the following example:

import mypackage
import mypackage.myclass
from mypackage import myclass

The first example runs mypackage/__init__.py and binds the name mypackage in your namespace; everything exposed by __init__.py can then be accessed through the package, e.g. mypackage.myclass(). The second example also runs __init__.py, then loads mypackage.myclass as well; it is still imported into that package's namespace, so it is still accessed as mypackage.myclass(). The third example imports mypackage.myclass into the current namespace, so it is accessed directly as myclass(), as if you had defined it yourself in the same script. This may hide things that you have named elsewhere.
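To see which name each form binds, the three statements can be run in separate, empty namespaces. This sketch uses the standard-library json package and its decoder submodule in place of mypackage and myclass; bound is a hypothetical helper.

```python
# Run each import form in its own empty namespace and list the names it binds.
ns1, ns2, ns3 = {}, {}, {}
exec("import json", ns1)
exec("import json.decoder", ns2)
exec("from json import decoder", ns3)


def bound(ns):
    """Names bound by the import, ignoring entries such as __builtins__."""
    return sorted(k for k in ns if not k.startswith("__"))


print(bound(ns1))  # ['json']
print(bound(ns2))  # ['json']     (reach the submodule as json.decoder)
print(bound(ns3))  # ['decoder']  (the submodule is bound directly)

# The second and third forms refer to the very same module object.
print(ns3["decoder"] is ns2["json"].decoder)  # True
```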

One other important use case looks like this:

import mypackage as mp

This lets you set the namespace that you want that package to be imported into, perhaps making it a shorthand or something more convenient.

As for your question about why scipy doesn't import everything when you call import scipy: that import call only imports whatever the developers tell it to in the __init__.py. For scipy specifically, if you do:

import scipy
dir(scipy)

You will see that it imports a bunch of classes and functions that are used throughout the package. I suspect that they intentionally don't import the submodules so as not to litter your runtime space with things that you aren't using. Perhaps there is a way to import everything automatically, but you probably shouldn't.
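For the curious, one possible way to import everything is to walk the package with pkgutil and import each submodule by name. This is a sketch under assumptions: import_all_submodules is a hypothetical helper, and eager_pkg is a throwaway package created only for the demonstration. On a real library some submodules may fail to import (for example because of optional dependencies) or may be slow to load, so expect to add error handling.

```python
import importlib
import pkgutil
import sys
import tempfile
from pathlib import Path


def import_all_submodules(package):
    """Recursively import every submodule of an already-imported package."""
    loaded = []
    prefix = package.__name__ + "."
    # walk_packages imports sub-packages itself so that it can descend into them.
    for info in pkgutil.walk_packages(package.__path__, prefix=prefix):
        importlib.import_module(info.name)
        loaded.append(info.name)
    return sorted(loaded)


# Throwaway package: eager_pkg/{alpha.py, sub/beta.py} (hypothetical names).
root = Path(tempfile.mkdtemp())
(root / "eager_pkg" / "sub").mkdir(parents=True)
(root / "eager_pkg" / "__init__.py").write_text("")
(root / "eager_pkg" / "alpha.py").write_text("")
(root / "eager_pkg" / "sub" / "__init__.py").write_text("")
(root / "eager_pkg" / "sub" / "beta.py").write_text("")
sys.path.insert(0, str(root))
importlib.invalidate_caches()

import eager_pkg

names = import_all_submodules(eager_pkg)
print(names)  # ['eager_pkg.alpha', 'eager_pkg.sub', 'eager_pkg.sub.beta']
print(hasattr(eager_pkg.sub, "beta"))  # True
```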

Hal Jarrett
  • Ok, the last paragraph of your answer is actually a good answer to my first question. I guess it is a good point that they intentionally only define the core functions and don't import submodules to not affect the runtime and make it as light as possible. As for your point regarding 'import mypackage as mp' I would advise against it, and only use it when there is a conflict. https://stackoverflow.com/questions/193919/what-are-good-rules-of-thumb-for-python-imports?rq=1 . This question/answer gives a nice elaboration as to why – nzicher Mar 08 '18 at 14:39
  • While the linked question makes a good point about importing classes with different names, in the form `from package import class as cls` or `import package.class as cls`, it is fairly standard practice to import an entire package with a shorter name, such as `import numpy as np` or `import matplotlib.pyplot as plt`, especially when you are typing out a bunch of references to that namespace – Hal Jarrett Mar 08 '18 at 16:17

Answer Q1

When you import a package, especially a large one like SciPy, Python runs its __init__.py initialisation module, and only the subpackages/modules that this file imports are loaded. Large packages tend to leave most of them out to save space. I won't go into this further as this is already mentioned in this question, documented here, and talked about in other answers.

Additionally, if you have questions about scripts vs. modules, this post is incredibly descriptive.

Answer Q2

To find the size of a package I would point you towards this post about finding package directories, and then this post about reporting the size of a particular directory. You could create some combined code to do both for you.
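A combined version of those two steps might look like the sketch below. It assumes the package is installed as ordinary files on disk, and it reports on-disk size (including any __pycache__ files), not the memory used once imported. package_size_on_disk is a hypothetical helper name, and the standard-library json package is used as the example.

```python
import importlib.util
import os


def package_size_on_disk(name):
    """Total size in bytes of the files that make up a package or module."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        raise ModuleNotFoundError(name)
    locations = spec.submodule_search_locations
    if not locations:
        # A plain module: a single file, or nothing on disk for built-ins.
        if spec.origin and os.path.isfile(spec.origin):
            return os.path.getsize(spec.origin)
        return 0
    total = 0
    for location in locations:
        for dirpath, _dirnames, filenames in os.walk(location):
            for filename in filenames:
                total += os.path.getsize(os.path.join(dirpath, filename))
    return total


size = package_size_on_disk("json")
print(f"json: {size / 1024:.1f} KiB on disk")
```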

Answer Q3

Update: I am unsure how to do this, since the usual from package import * behaves as explained in the documentation (similar to Q1):

if a package’s __init__.py code defines a list named __all__, it is taken to be the list of module names that should be imported when from package import * is encountered
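The quoted rule can be checked with a small throwaway package. In this sketch star_pkg, alpha and beta are invented names; __all__ lists only alpha, so the star import loads that submodule and leaves beta alone.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical package with two submodules, only one of them listed in __all__.
root = Path(tempfile.mkdtemp())
pkg_dir = root / "star_pkg"
pkg_dir.mkdir()
(pkg_dir / "__init__.py").write_text("__all__ = ['alpha']\n")
(pkg_dir / "alpha.py").write_text("")
(pkg_dir / "beta.py").write_text("")
sys.path.insert(0, str(root))
importlib.invalidate_caches()

# Run the star import in a fresh namespace so the bound names are easy to list.
ns = {}
exec("from star_pkg import *", ns)

star_names = sorted(k for k in ns if not k.startswith("__"))
print(star_names)                      # ['alpha']
print("star_pkg.beta" in sys.modules)  # False
```

So from package import * loads all submodules only if the package author lists all of them in __all__, which is again a per-package decision.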

Nebbles
  • My third question is actually regarding 'import scipy' such that it also imports all the subpackages/modules, but I guess there is no easy way of doing that. thanks for clarifying all the others – nzicher Mar 19 '18 at 11:10
  • I genuinely thought that ``from scipy import *`` would work for that... thanks for the ✓ – Nebbles Mar 20 '18 at 19:07
  • it doesn't. It works the same way as import mypackage.myclass which doesn't do what I meant. I wanted to import mypackage such that it automatically imports myclass as well – nzicher Mar 21 '18 at 16:29
  • Yes, my apologies, I was too tired to understand the documentation properly. I have updated my post now to reflect this – Nebbles Mar 21 '18 at 17:42