I have a fork of an open-source Python module (CLTK) that is needed for a particular application. To make things easier for people running my code, I modify sys.path before importing it, so that they can keep standard CLTK for most applications and use my forked CLTK for this one purpose.

import sys
sys.path.insert(1, '/path/to/fork/cltk/')
import cltk

In standard Python (either through the REPL or in a script), this works fine. It imports my forked version without any issues.

In Jupyter, though, I get the following exception:

/path/to/fork/cltk/__init__.py in <module>
     20 __url__ = 'http://cltk.org'
     21 
---> 22 __version__ = get_distribution('cltk').version  # pylint: disable=no-member
     23 
     24 if 'CLTK_DATA' in os.environ:

/usr/lib/python3/dist-packages/pkg_resources/__init__.py in get_distribution(dist)
    469         dist = Requirement.parse(dist)
    470     if isinstance(dist, Requirement):
--> 471         dist = get_provider(dist)
    472     if not isinstance(dist, Distribution):
    473         raise TypeError("Expected string, Requirement, or Distribution", dist)

/usr/lib/python3/dist-packages/pkg_resources/__init__.py in get_provider(moduleOrReq)
    345     """Return an IResourceProvider for the named module or requirement"""
    346     if isinstance(moduleOrReq, Requirement):
--> 347         return working_set.find(moduleOrReq) or require(str(moduleOrReq))[0]
    348     try:
    349         module = sys.modules[moduleOrReq]

/usr/lib/python3/dist-packages/pkg_resources/__init__.py in require(self, *requirements)
    889         included, even if they were already activated in this working set.
    890         """
--> 891         needed = self.resolve(parse_requirements(requirements))
    892 
    893         for dist in needed:

/usr/lib/python3/dist-packages/pkg_resources/__init__.py in resolve(self, requirements, env, installer, replace_conflicting, extras)
    775                     if dist is None:
    776                         requirers = required_by.get(req, None)
--> 777                         raise DistributionNotFound(req, requirers)
    778                 to_activate.append(dist)
    779             if dist not in req:

DistributionNotFound: The 'cltk' distribution was not found and is required by the application

Now, since this fork is under my control, I can just comment out the offending line, or hardcode a version number, or otherwise remove the call to get_distribution.

But, I'd like to know why this happens specifically in Jupyter and not in standard Python.

What's happening differently in my Jupyter notebook that causes this to break, when it works just fine in the REPL or in a Python script file?
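
To compare the two environments, something like the following could be run once in the REPL and once in a notebook cell. It is a rough diagnostic sketch under my own assumptions: describe_environment is a hypothetical helper name, and it uses importlib.metadata (Python 3.8+) rather than pkg_resources to check whether an installed 'cltk' distribution is visible to the running interpreter.

```python
import sys
from importlib.metadata import PackageNotFoundError, distribution


def describe_environment(dist_name):
    """Collect details that could differ between a script/REPL and a Jupyter kernel."""
    try:
        dist = distribution(dist_name)
        found_version = dist.version
        # Directory that the distribution's files are installed relative to.
        location = str(dist.locate_file(""))
    except PackageNotFoundError:
        found_version, location = None, None
    return {
        "executable": sys.executable,  # which interpreter is actually running
        "sys_path": list(sys.path),    # where modules are searched for
        "dist_version": found_version,  # None if no metadata is visible
        "dist_location": location,
    }


info = describe_environment("cltk")
for key, value in info.items():
    print(f"{key}: {value}")
```

If the two runs report a different executable, or one reports no distribution at all, that would point to the notebook kernel using a different interpreter or site-packages than the shell.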

Draconis
  • Does this answer your question? [sys.path different in Jupyter and Python - how to import own modules in Jupyter?](https://stackoverflow.com/questions/34976803/sys-path-different-in-jupyter-and-python-how-to-import-own-modules-in-jupyter) – Romain Apr 09 '23 at 03:50
  • @Romain Unfortunately I don't think so. The issue doesn't seem to be with locating the module to import, but with pkg_resources not working properly once it's found. – Draconis Apr 09 '23 at 04:16
  • How about specifying the path to the 'cltk' package in your fork directly in the code, rather than relying on get_distribution to find it? Something like this:
    ```
    import sys
    sys.path.insert(1, '/path/to/fork/cltk/')
    import cltk
    cltk.__version__ = '1.0.0'  # Replace with the version of your fork
    ```
    – Phoenix Apr 09 '23 at 04:40
  • @Phoenix That does fix the error, but I want to understand _why_ this is happening, since I suspect this difference could bite me again in the future (for something that's not so easy to modify). – Draconis Apr 09 '23 at 05:06
