
I'm trying to retrieve metadata for a Python package, given only the name of one of its modules.

I can use importlib-metadata to retrieve the information, but in some cases the top-level module name is not the same as the package name.

Example:

>>> importlib_metadata.metadata('zmq')['License']
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "c:\Users\xxxxx\AppData\Local\Programs\Python\Python37\Lib\site-packages\importlib_metadata\__init__.py", line 499, in metadata
    return Distribution.from_name(distribution_name).metadata
  File "c:\Users\xxxxx\AppData\Local\Programs\Python\Python37\Lib\site-packages\importlib_metadata\__init__.py", line 187, in from_name
    raise PackageNotFoundError(name)
importlib_metadata.PackageNotFoundError: zmq


>>> importlib_metadata.metadata('pyzmq')['License']
'LGPL+BSD'
nsk
  • Some ideas: https://stackoverflow.com/a/60363617/11138259 -- https://stackoverflow.com/a/60351412/11138259 – sinoroc Apr 01 '20 at 16:14
  • For others who come past here (via Google etc.), check out this other discussion for more ideas based on @sinoroc's suggestion: https://stackoverflow.com/questions/63847850/python3-pip-find-which-package-provides-a-particular-module/63887567#63887567 – sarlacii Sep 14 '20 at 15:45

2 Answers

I believe something like the following should work:

#!/usr/bin/env python3

import importlib.util
import pathlib

import importlib_metadata

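def get_distribution(file_name):
    """Return the installed distribution that owns file_name, or None."""
    path = pathlib.Path(file_name)
    result = None
    for distribution in importlib_metadata.distributions():
        try:
            relative = path.relative_to(distribution.locate_file(''))
        except ValueError:
            # The file is not located under this distribution's root.
            continue
        # `files` is None when the distribution ships no file listing.
        # Compare POSIX strings: the entries in `files` are POSIX-style
        # paths, which never compare equal to a WindowsPath.
        files = {f.as_posix() for f in distribution.files or ()}
        if relative.as_posix() in files:
            result = distribution
    return result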

def alpha():
    file_name = importlib.util.find_spec('easy_install').origin
    distribution = get_distribution(file_name)
    print("alpha", distribution.metadata['Name'])

def bravo():
    file_name = importlib_metadata.__file__
    distribution = get_distribution(file_name)
    print("bravo", distribution.metadata['Name'])

if __name__ == '__main__':
    alpha()
    bravo()

Update (February 2021):

Looks like this could become easier thanks to the newly added packages_distributions() function in importlib_metadata: https://importlib-metadata.readthedocs.io/en/stable/using.html#package-distributions
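
As a rough sketch of how packages_distributions() could be applied to the original question (the helper names `distributions_for_module` and `license_for_module` are made up for this illustration, and the fallback import assumes the importlib_metadata backport is installed on interpreters older than Python 3.10):

```python
try:
    # Standard library location on Python 3.10+.
    from importlib.metadata import metadata, packages_distributions
except ImportError:
    # Backport; assumed to be installed on older interpreters.
    from importlib_metadata import metadata, packages_distributions


def distributions_for_module(module_name):
    """Return the names of installed distributions providing module_name.

    Hypothetical helper: only the first component of a dotted name is used,
    because packages_distributions() is keyed by top-level import names.
    """
    top_level = module_name.partition(".")[0]
    return list(packages_distributions().get(top_level, []))


def license_for_module(module_name):
    """Map each providing distribution to its License metadata field (or None)."""
    return {
        dist_name: metadata(dist_name).get("License")
        for dist_name in distributions_for_module(module_name)
    }
```

With pyzmq installed, `distributions_for_module('zmq')` would be expected to list `pyzmq`. A list is returned rather than a single name because several distributions can provide the same top-level name (namespace packages, for instance).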

sinoroc
  • This assumes the path of the toplevel modules are subdirectories of the package. zmq and pyzmq-.dist-info are both subdirectories of site-packages. – nsk Apr 01 '20 at 18:20
  • @nsk I do not understand the issue. As a test I just installed `pyzmq` and the above code seems to work perfectly fine. – sinoroc Apr 01 '20 at 19:19
  • @nsk Did that answer your question or is there still some clarification needed? – sinoroc Apr 13 '20 at 12:08
  • Nice solution. The only thing I would add is that there is an `is_relative_to` method that can slightly simplify things. – Noldorin Mar 08 '21 at 21:58
  • I spoke too soon: unfortunately, because so many packages are just eggs in e.g. /usr/local/lib/python3.9/site-packages/, this fails. – Noldorin Mar 08 '21 at 22:01
  • @Noldorin Ah right, if the code is installed in / imported from an archive (egg, zip file, etc.) then things might go wrong. Maybe there are ways to deal with this, I should look into this. – sinoroc Mar 09 '21 at 17:32
  • @Noldorin If you know the top level package (or module) then you can use this: https://importlib-metadata.readthedocs.io/en/stable/using.html#package-distributions – sinoroc Mar 09 '21 at 17:36
  • @sinoroc Yep, that's what I ended up doing, cheers. I also came up with a general solution though, which I've posted as an answer below. – Noldorin Mar 09 '21 at 17:51

I believe this function does what you want. It is not very efficient, since it has to enumerate all installed package distributions and read the list of top-level modules for each, but I believe it is the best one can do. (You can also cache a dict mapping top-level module names to package names.)

from importlib.metadata import Distribution, distributions
from typing import Optional

def get_pkg_distribution(top_level_module: str) -> Optional[Distribution]:
    """Return the first installed distribution listing top_level_module in top_level.txt."""
    for dist in distributions():
        # top_level.txt is written by setuptools; read_text returns None when it is absent.
        package_namespaces = (dist.read_text("top_level.txt") or "").splitlines()
        if top_level_module in package_namespaces:
            return dist
    return None

# Get the package metadata for the current package. Note: `__package__` is the
# dotted name of the containing package, so keep only its first component.
# This raises AttributeError if no matching distribution is found.
pkg_metadata = dict(get_pkg_distribution(__package__.partition(".")[0]).metadata.items())
__version__ = pkg_metadata["Version"]
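
The caching idea mentioned above could look roughly like this. It is only a sketch: `top_level_index` and `find_distribution_names` are names invented for the example, and it shares the limitation of the function above that distributions without a top_level.txt file are not seen at all.

```python
from collections import defaultdict
from functools import lru_cache
from importlib.metadata import distributions


@lru_cache(maxsize=None)
def top_level_index():
    """Build {top-level module name: [distribution names]} once per process."""
    index = defaultdict(set)
    for dist in distributions():
        meta = dist.metadata
        name = meta.get("Name") if meta is not None else None
        if not name:
            continue  # skip broken installs without a usable name
        # top_level.txt holds one top-level import name per line.
        for line in (dist.read_text("top_level.txt") or "").splitlines():
            if line.strip():
                index[line.strip()].add(name)
    return {module: sorted(names) for module, names in index.items()}


def find_distribution_names(top_level_module):
    """Look up the cached index; returns an empty list when nothing matches."""
    return top_level_index().get(top_level_module, [])
```

The index is built on the first call and reused afterwards, so packages installed later in the same process are not picked up unless `top_level_index.cache_clear()` is called.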
Noldorin
  • This solution seems to have the same shortcomings as mine. In that it relies on filenames. It would be better to rely on fully qualified module name, get the top level package/module name and then compare with https://importlib-metadata.readthedocs.io/en/stable/using.html#package-distributions – sinoroc Mar 09 '21 at 18:25
  • Actually, my solution is very similar to the implementation of `packages_distributions`, having just looked at the source code for that lib! I believe `__package__` should get the fully qualified name of the top-level module? Not 100% sure though. Does the trick for me. – Noldorin Mar 09 '21 at 19:02
  • Does it work for code that is installed as egg, or something like that? – sinoroc Mar 09 '21 at 19:04
  • It seemed to when I last tried... but I can investigate properly later, maybe. – Noldorin Mar 09 '21 at 19:05
  • Yes, please let me know if you manage to test. I really wonder how to get a valid `__file__` out of an egg. – sinoroc Mar 09 '21 at 19:54
  • @sinoroc Oh, I see what you mean now... I didn't manage to test that. Have you tried `__package__` from within an egg/wheel though? Or if not that, then maybe the `inspect` module? – Noldorin Mar 09 '21 at 22:24