4

I was learning about Python modules, and as I understand it, when we try to import a module, Python checks whether the module can be found in one of the locations listed in sys.path; if it cannot, a ModuleNotFoundError is raised.

Adding locations to sys.path

Suppose I want to import from a location that is not in sys.path by default. I can simply append the new location to sys.path and everything works fine, as shown in the snippet below.

~/Documents/python-modules/usemymodule.py

import sys
sys.path.append("/home/som/Documents/modules")

import mymodule
mymodule.yell("Hello World")

~/Documents/python-modules/modules/mymodule.py

def yell(txt):
    print(f"{txt.upper()}")
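A self-contained way to try the same thing is sketched below. It assumes a temporary directory can stand in for the modules folder, and `demo_mymodule` and `import_from_extra_dir` are names made up for this sketch only.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def import_from_extra_dir():
    """Return (found_before, shout) for a module that lives outside sys.path."""
    name = "demo_mymodule"  # hypothetical module name, used only in this sketch
    with tempfile.TemporaryDirectory() as tmp:
        # Stand-in for the mymodule.py file shown above
        Path(tmp, f"{name}.py").write_text(
            "def yell(txt):\n    return txt.upper()\n"
        )
        try:
            importlib.import_module(name)
            found_before = True
        except ModuleNotFoundError:
            found_before = False  # the directory is not on sys.path yet

        sys.path.append(tmp)
        importlib.invalidate_caches()  # let the path finders notice the new directory
        try:
            module = importlib.import_module(name)
            shout = module.yell("Hello World")
        finally:
            # Undo the changes so the interpreter is left as it was found
            sys.path.remove(tmp)
            sys.modules.pop(name, None)
    return found_before, shout


if __name__ == "__main__":
    print(import_from_extra_dir())
```

The first import attempt fails because the directory is unknown to the import system; the second succeeds only because the directory was appended to sys.path in between.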

Clearing sys.path

My doubt is this: if I clear the entire sys.path list, I should not be able to import any modules, but to my surprise I can still import built-in modules. The code below works fine.

import sys
sys.path.clear()

import math
math.ceil(10.2)

I thought it might be that Python doesn't use sys.path internally and that sys.path is just a shallow copy of the list Python actually uses. But then how does adding to sys.path work, and why can I import only built-in modules, and not custom modules, after clearing it?

I am really stuck, and any help would be appreciated. There is a similar question to this one, but it doesn't answer my doubts.

  • Wouldn't it be nice if the `built-in` things were built into the Python interpreter somehow, so that no real source file is needed? Not sure if that is the case, but my guess would be to look right where the Python interpreter itself is located. See [how-can-i-find-the-location-of-the-source-code-of-a-built-in-python-method](https://stackoverflow.com/questions/34376558/how-can-i-find-the-location-of-the-source-code-of-a-built-in-python-method) – Patrick Artner Jun 05 '21 at 12:11
  • @PatrickArtner I didn't really get you. Are you telling me to look at where `math.py` is located? If that is the case, it is located at `/usr/lib/python3.8`, which is an entry in `sys.path` before clearing. –  Jun 05 '21 at 12:17
  • My best guess is that Python has a location built into the binary, e.g. `/usr/lib/python$PYTHON_VERSION`, and uses that as a fallback when `sys.path` is empty – TheEagle Jun 05 '21 at 12:17
  • @python_user NO, absolutely not! –  Jun 05 '21 at 12:26
  • @Programmer So, what if I clear certain entries from `sys.path` rather than completely clearing it, will the interpreter look into `sys.path` or not? –  Jun 05 '21 at 12:27
  • @python_user I am not interested in the location of the module rather I am interested in how is the interpreter searching for these locations. I hope you're getting me. –  Jun 05 '21 at 12:29
  • @Prakhar Okay! That kinda makes sense, but then why is the python binary location present in `sys.path` when it is never going to use that for built-in modules. –  Jun 05 '21 at 12:34
  • What platform are you on? `"/home/som/Documents/modules"` looks like a Unix path, but `math` is only built-in on Windows. – user2357112 Jun 05 '21 at 12:57
  • If you're on Unix and the `math` import still worked, that probably means `math` was already loaded before you cleared `sys.path`. – user2357112 Jun 05 '21 at 12:59
  • @user2357112supportsMonica I am on linux but I am using WSL, also added an image for reference. –  Jun 05 '21 at 13:08

2 Answers

1

CPython keeps a table of built-in modules, such as math, that is defined in the file PC/config.c and looks like this:

struct _inittab _PyImport_Inittab[] = {

    {"_abc", PyInit__abc},
    {"array", PyInit_array},
    {"_ast", PyInit__ast},
    {"audioop", PyInit_audioop},
    {"binascii", PyInit_binascii},
    {"cmath", PyInit_cmath},
    ...
};

So when it needs to import a built-in module, it looks inside this table instead of searching sys.path. Each of the "PyInit" functions in the table returns an in-memory module object.
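To see which names are in that table on a particular build, one option is to ask `importlib.machinery.BuiltinImporter` directly. This is a minimal sketch; `builtin_spec_origin` is a helper name invented here, and which modules count as built-in varies between platforms and builds.

```python
import sys
from importlib.machinery import BuiltinImporter


def builtin_spec_origin(name):
    """Return the spec origin if `name` is compiled into the interpreter, else None."""
    spec = BuiltinImporter.find_spec(name)
    return None if spec is None else spec.origin


if __name__ == "__main__":
    # The result for math depends on the platform and build; sys is always built in.
    for candidate in ("sys", "math", "json"):
        print(candidate, candidate in sys.builtin_module_names, builtin_spec_origin(candidate))
```

A name that is not compiled into the interpreter gets `None` here and has to be located by another finder, which is where sys.path comes into play.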

This table is then exposed as sys.builtin_module_names, which is initialized in sysmodule.c. When an import runs, the code in importlib._bootstrap._find_spec goes over the list of finders in sys.meta_path. One of them is importlib._bootstrap.BuiltinImporter, which is responsible for importing built-in modules and does not consult sys.path. This demonstrates sys.meta_path:

>>> import sys
>>> sys.modules['math']
<module 'math' (built-in)>
>>> sys.path.clear()
>>> import math  # This works because math is in the module cache.
>>> del sys.modules['math']
>>> import math  # This works because of BuiltinImporter in sys.meta_path!
>>> sys.meta_path.clear()
>>> import math  # This still works because math is in the module cache.
>>> del sys.modules['math']
>>> import math  # This fails because we cleared sys.meta_path!
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'math'

This was run on Python 3.7 with Anaconda; results may vary under different distributions.
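To see which entry of sys.meta_path ends up handling a given name, a rough sketch like the following may help. `which_finder` is a name made up for this example, and the exact contents of sys.meta_path depend on the environment (tools such as test runners or editable installs can add their own finders).

```python
import sys


def which_finder(name):
    """Return the name of the first sys.meta_path finder that claims `name`, or None."""
    for finder in sys.meta_path:
        find_spec = getattr(finder, "find_spec", None)
        if find_spec is None:
            continue  # skip legacy finders that do not implement find_spec
        if find_spec(name, None) is not None:
            # Standard finders are classes, so prefer their __name__
            return getattr(finder, "__name__", type(finder).__name__)
    return None


if __name__ == "__main__":
    for candidate in ("sys", "math", "json"):
        print(candidate, "->", which_finder(candidate))
```

Only names reported as handled by a path-based finder depend on sys.path; names claimed by BuiltinImporter are resolved before sys.path is ever searched.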

I want to add that your test doesn't account for the module cache in sys.modules. Consider this example with a non-builtin module:

>>> import requests
>>> import sys
>>> sys.path.clear()
>>> import requests  # This works!
>>> del sys.modules['requests']
>>> import requests  # This doesn't.
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'requests'
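The same cache effect can be reproduced without requests installed. The sketch below assumes a throwaway module in a temporary directory (`cache_demo_mod` and `cache_demo` are invented names), and it removes only that one directory from sys.path instead of clearing the whole list, so the rest of the session keeps working.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def cache_demo():
    """Show that sys.modules can satisfy an import that sys.path no longer can."""
    name = "cache_demo_mod"  # hypothetical module name, used only in this sketch
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, f"{name}.py").write_text("VALUE = 42\n")
        sys.path.append(tmp)
        importlib.invalidate_caches()
        importlib.import_module(name)  # first import: found through sys.path

        sys.path.remove(tmp)  # the module's directory is no longer searchable
        cached = importlib.import_module(name)  # served from sys.modules
        results["cached_import_ok"] = cached.VALUE == 42

        del sys.modules[name]  # evict the module from the cache
        try:
            importlib.import_module(name)
            results["import_after_eviction_ok"] = True
        except ModuleNotFoundError:
            results["import_after_eviction_ok"] = False
    return results


if __name__ == "__main__":
    print(cache_demo())
```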
unddoch
  • Okay! So, if I understood you correctly, if I try to `import requests` after clearing `sys.path`, it will raise ModuleNotFoundError. –  Jun 05 '21 at 12:50
  • Yes, but only if you also remove it from the cache first. – unddoch Jun 05 '21 at 12:51
  • That's strange, because I just tried it without clearing the cache and it still raises an exception. All I did was: `import sys; sys.path.clear(); import requests` –  Jun 05 '21 at 12:53
  • Note that `PC/config.c` is where that list is defined for a Windows build. On other platforms, the list is in the `config.c` generated from [`Modules/config.c.in`](https://github.com/python/cpython/blob/main/Modules/config.c.in). – user2357112 Jun 05 '21 at 12:55
  • @SomShekharMukherjee I updated my answer with details about `sys.meta_path`. – unddoch Jun 05 '21 at 19:44
1

I tried to reproduce your example and, to my surprise, did not get the same result (note: Python 3.9 here):

import sys
sys.path.clear()

import math
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'math'

However, this works:

import math
del math

import sys
sys.path.clear()

import math

# but removing the reference in sys.modules will break the import again
del sys.modules['math']
import math
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'math'

My guess is that the interpreter keeps a reference to the math module from a previous import in sys.modules, and thus has no need to search for it in sys.path.
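One way to check this guess on a given interpreter is to look at whether the module is already cached and what its spec reports as its origin. This is only a sketch; `describe` is a helper name invented here, and the output for math differs between platforms and builds.

```python
import importlib.util
import sys


def describe(name):
    """Report whether `name` is cached in sys.modules and what its spec gives as origin."""
    cached = name in sys.modules
    spec = importlib.util.find_spec(name)
    origin = None if spec is None else spec.origin
    return {"cached": cached, "origin": origin}


if __name__ == "__main__":
    # Run this before touching sys.path; the result for math is build-dependent.
    print(describe("sys"))
    print(describe("math"))
```

An origin of `'built-in'` means the module is compiled into the interpreter, while a file path means it is found through sys.path and can only survive a cleared sys.path if it is already cached.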

Romuald Brunet