
I have a large Python program that needs to run in a new virtual environment (on another machine). The program imports several external modules, which must first be installed in the new environment.

For example, one of my modules has the following imports:

import matplotlib
import os
from kivy.uix.label import Label
import my_file.my_module_5 as my_mod_5 

and another module has:

import my_module_7
import django

In this case I would need to create a list like this:

['matplotlib', 'kivy', 'django']

Notice that my own modules are not included, since they are part of the program that will be migrated to the new env and don't have to be installed. Neither are modules that ship with Python, like os.

I have created a function that finds all modules imported in my project and filters out those that belong to the project itself. However, it also returns standard Python modules such as os and sys.

import os
import re


def all_modules_in_project():
    """
    Finds all modules imported in the current working directory tree.

    :return: Set of module names.
    """

    project_directories = set()
    project_files = set()

    modules_imported = set()

    for path, dirs_names, files_names_in_dir in os.walk(os.getcwd()):
        project_directories |= set(dirs_names)

        for file_name in files_names_in_dir:
            if file_name.endswith('.py'):

                project_files.add(file_name[:-3])

                with open(os.path.join(path, file_name), 'r') as opened_file:
                    file_lines = opened_file.readlines()

                    for line in file_lines:

                        # import XXX
                        match = re.match(r'import\s([\w\.]+)', line)
                        if match:
                            modules_imported.add(match.groups()[0])

                        # from XXX
                        match = re.match(r'from\s([\w\.]+)', line)
                        if match:
                            modules_imported.add(match.groups()[0])

    # Removes XXX that were matched as follows `import proj_dir. .. .XXX`
    for module in modules_imported.copy():
        matched = re.match(r'(\w+)\.', module)
        if matched:
            pre_dot = matched.groups()[0]

            if pre_dot in project_directories:
                modules_imported.remove(module)

            else:
                # Replaces `xxx.yyy` with `xxx`
                modules_imported.remove(module)
                modules_imported.add(pre_dot)

    return modules_imported - project_files - project_directories

  • How can I filter out the standard Python libraries that don't need to be installed?
  • Alternatively, is there a different, easier way to determine which external libraries are used by my program?

(I don't need all installed packages; I need only those that are imported by the program)

user
  • `pip freeze > requirements.txt` – Bob Dylan Feb 18 '16 at 20:38
  • @BobDylan I need *only* the packages that I am using in my project. Your suggestion would include all installed packages. – user Feb 18 '16 at 20:40
  • So you want to ignore unused imports? – Josh J Feb 18 '16 at 21:23
  • @JoshJ I need only the imported external modules; non-imported ones should be ignored. Being used or not shouldn't matter (they are all used anyway). Which makes me realize that my edit made my question unclear. I apologize, I'll revert/improve the question. – user Feb 18 '16 at 21:28
  • https://bitbucket.org/blais/snakefood Example http://stackoverflow.com/a/2875570/1182891 – Josh J Feb 18 '16 at 21:31
  • You should use virtual environments. – Bob Dylan Feb 19 '16 at 01:56

1 Answer

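You can get a list of installed third-party distributions from within a script by using pip's get_installed_distributions function. Note that this is not a supported public API: the import below works with pip versions before 10, where the function was still exposed at the top level of the package.

# Works with pip < 10 only
from pip import get_installed_distributions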

# Get a list of distribution objects (see pkg_resources documentation)
dist_list = get_installed_distributions()

# Get a list of distribution names as strings
dist_names = [dist.project_name for dist in dist_list]

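You can then use that list in your function to decide whether an imported module is third-party rather than part of the standard library. This assumes the import name matches the distribution name, which is not guaranteed (the PyYAML distribution, for instance, is imported as yaml). For example: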

if pre_dot in dist_names:
    modules_imported.add(module)

For more information on distribution objects, check out the pkg_resources documentation.

Edit: since you're in a venv, you'll have pip, but if you don't have pip, you can get a similar list as follows, using a package you know to be installed (I'm using setuptools in this example):

from pkg_resources import get_distribution, find_distributions

site_packages_dir = get_distribution('setuptools').location

third_party_packages = []

for dist in find_distributions(site_packages_dir):
    third_party_packages.append(dist.project_name)
MPlanchard