84

What is the most efficient way to list all dependencies required to deploy a working project elsewhere (on a different OS, say)?

I'm on Python 2.7 in a Windows dev environment. I don't use a virtualenv per project; instead I have one global dev environment, install libraries as needed, and happily hop from one project to the next.

I've kept track of most (not sure all) libraries I had to install for a given project. I have not kept track of any sub-dependencies that came auto-installed with them. Doing pip freeze lists both, plus all the other libraries that were ever installed.

Is there a way to list what you need to install, no more, no less, to deploy the project?

EDIT: In view of the answers below, some clarification. My project consists of a bunch of modules (that I wrote), each with a bunch of imports. Should I just copy-paste all the imports from all modules into a single file, sort them, eliminate duplicates, and throw out everything from the standard library (and how do I know which modules those are)? Or is there a better way? That's the question.

RolfBly

10 Answers

161

pipreqs solves the problem. It generates a project-level requirements.txt file.

Install pipreqs: pip install pipreqs

  1. Generate a project-level requirements.txt file: pipreqs /path/to/your/project/
  2. The requirements file will be saved to /path/to/your/project/requirements.txt

If you want to read more advantages of pipreqs over pip freeze, read it from here

Haifeng Zhang
  • @wim Thanks for your comment. For production, I agree `setup.py` is a better choice. For dev purposes, I recommend using `requirements.txt` to pin the package versions for the dev environment – Haifeng Zhang Feb 14 '17 at 22:34
  • The question asks to **list all dependencies required to deploy a working project**. Later it asks "Is there a way to list what you need to install, **no more**, no less, to deploy the project?". That's clearly asking about production, not development! I also recommend using `requirements.txt` to pin the package versions for the dev environment, but that's not what the question is about. – wim Feb 14 '17 at 22:37
  • I guess this is what you're referring to: https://packaging.python.org/requirements/ – nighthawk454 Feb 15 '17 at 01:12
  • @nighthawk454 Yes, exactly – wim Feb 15 '17 at 18:56
  • `pipreqs` does not work for me. I have a script that imports pdfminer (`from pdfminer.pdfpage import PDFPage`), yet requirements.txt remains empty; pdfminer should be listed. – Hans Deragon Oct 14 '18 at 22:34
  • I'm working with Django deployed on Heroku and tried creating a requirements.txt using pipreqs, but encountered some errors. From what I've learned, pipreqs gathers the dependencies that are imported in the Python scripts within a Django project dir, but does not consider the applications listed in settings.INSTALLED_APPS (e.g. django crispy forms). – Ogs Apr 20 '19 at 09:54
  • The problem is that it fails on python 2 files... (Installed it with Python 3) – PlasmaBinturong Jan 06 '20 at 17:21
  • @R.Karlus what error did you get, I use it in my python2.7 and python3.5 projects – Haifeng Zhang Jun 17 '20 at 22:16
  • Unfortunately, pipreqs has some "stupid" bugs, like for mysql-connector or hydra – Qiulang Jun 17 '21 at 06:40
  • `pipreqs` will miss packages actually used by the project (e.g. gunicorn), and add packages that don't make sense because they're not used and do not even exist (e.g. `en_core_web_sm==3.5.0`, `python_bcrypt==0.3.2`.) – Fabien Snauwaert Feb 11 '23 at 12:34
32

Scan your import statements. Chances are you only import things you explicitly wanted to import, and not the dependencies.

Make a list like the one pip freeze does, then create and activate a virtualenv.

Do pip install -r your_list, and try to run your code in that virtualenv. Heed any ImportError exceptions, match them to packages, and add to your list. Repeat until your code runs without problems.

Now you have a list to feed to pip install on your deployment site.

This is extremely manual, but requires no external tools, and forces you to make sure that your code runs. (Running your test suite as a check is great but not sufficient.)

9000
  • This answer is correct, and is nice in that it doesn't depend on any tools. Check out some of the other answers for ways to scan your imports automatically. – nighthawk454 Feb 14 '17 at 22:16
  • @nighthawk454: Thank you. I've upvoted these other answers. – 9000 Feb 14 '17 at 22:17
  • With currently so many tools and none that do a good job, this might actually end up being the fastest and easiest answer. – Fabien Snauwaert Feb 11 '23 at 12:37
13

On your terminal type:

pip install pipdeptree
cd <your project root>
pipdeptree
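To make the idea of "top-level" packages concrete, here is a rough sketch of it using only the standard library. It is not how pipdeptree is implemented, it assumes Python 3.8 or newer for importlib.metadata, and like pipdeptree it looks at what is installed in the current environment rather than at your project's imports. The function names are made up for this illustration.

```python
import re
from importlib.metadata import distributions


def _normalize(name):
    # PEP 503 style: case-insensitive, runs of "-", "_" and "." are equivalent
    return re.sub(r"[-_.]+", "-", name).lower()


def top_level(requires_by_name):
    """Given {distribution name: [requirement strings]}, return the names
    that no other distribution in the mapping requires."""
    required = set()
    for name, requirements in requires_by_name.items():
        for req in requirements:
            if re.search(r"extra\s*==", req):
                continue  # optional dependency, only pulled in by an extra
            match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", req)
            if match and _normalize(match.group()) != _normalize(name):
                required.add(_normalize(match.group()))
    return sorted(n for n in requires_by_name if _normalize(n) not in required)


def installed_requirements():
    """Collect the requirement strings of every installed distribution."""
    result = {}
    for dist in distributions():
        name = dist.metadata.get("Name")
        if name:  # skip distributions with unreadable metadata
            result[name] = list(dist.requires or [])
    return result


print(top_level(installed_requirements()))
```

In a clean virtualenv that contains only what the project needs, the printed names are the candidates for a minimal requirements list. In a shared global environment the list also includes top-level packages from every other project.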
Luca
Venu Gopal Tewari
  • `pipdeptree` is a powerful tool that does what you are saying. In its [README](https://github.com/naiquevin/pipdeptree#using-pipdeptree-to-write-requirementstxt-file) it tells you how to get just the higher level dependencies. Steps: 1) Create a virtual environment. 2) Attempt to run the program. Install the dependencies it expects as it errors. 3) Then use `pipdeptree` command in the linked Readme to get the minimum requirements. – Rebecca Sich Jul 27 '20 at 23:11
  • I get the same output no matter what folder I'm in. This makes me think it does not consider the modules but is just reading what's installed in pip. – JDPeckham Aug 27 '20 at 19:33
  • It respects the deepest level of directory to check for the dependencies. If you are outside a venv then it will take them from the global installation. – Venu Gopal Tewari Aug 28 '20 at 12:57
3

You can simply use pipreqs. Install it using:

pip install pipreqs

Then run pipreqs . in the project's directory. A text file named requirements.txt will be created for you, which looks like this:

numpy==1.21.1
pytest==6.2.4
matplotlib==3.4.2
PySide2==5.15.2
Mostafa Wael
3

You could use the findpydeps module I wrote:

  • Install it via pip: pip install findpydeps
  • If you have a main file: findpydeps -l -i path/to/main.py (the -l will follow the imports in the file)
  • Or your code is in a folder: findpydeps -i path/to/folder
  • Most importantly, the output is pip-friendly:
    • do findpydeps -i . > requirements.txt (assuming . is your project's directory)
    • then pip install -r requirements.txt

You can of course search through multiple directories and files for requirements, like: findpydeps -i path/to/file1.py path/to/folder path/to/file2.py, etc.

By default, it will remove the packages that are in the python standard library, as well as local packages. Refer to the -r/--removal-policy argument for more info.

If you don't want imports that are done in if, try/except or with blocks, just add --no-blocks. The same goes for functions with --no-functions.

Anyway, you get the idea: there are a lot of options (most of them are not discussed here). Refer to the findpydeps -h output for more help!

  • Top tool, from what I can tell other pip modules don't list imported python files. This is exactly what I needed. – watbywbarif Apr 12 '22 at 08:07
2

I found the answers here didn't work too well for me, as I only wanted the imports from inside our repository (e.g. I don't need import requests, but I do need from my.module.x import y).

I noticed that PyInstaller had perfectly good functionality for this though. I did a bit of digging and managed to find their dependency graph code, then just created a function to do what I wanted with a bit of trial and error. I made a gist here since I'll likely need it again in the future, but here is the code:

import os

from PyInstaller.depend.analysis import initialize_modgraph


def get_import_dependencies(*scripts):
    """Get a list of all imports required.
    Args: script filenames.
    Returns: list of imports
    """
    script_nodes = []
    scripts = set(map(os.path.abspath, scripts))

    # Process the scripts and build the map of imports
    graph = initialize_modgraph()
    for script in scripts:
        graph.run_script(script)
    for node in graph.nodes():
        if node.filename in scripts:
            script_nodes.append(node)

    # Search the imports to find what is in use
    dependency_nodes = set()
    def search_dependencies(node):
        for reference in graph.getReferences(node):
            if reference not in dependency_nodes:
                dependency_nodes.add(reference)
                search_dependencies(reference)
    for script_node in script_nodes:
        search_dependencies(script_node)

    return list(sorted(dependency_nodes))


if __name__ == '__main__':
    # Show the PyInstaller imports used in this file
    for node in get_import_dependencies(__file__):
        if node.identifier.split('.')[0] == 'PyInstaller':
            print(node)

All the node types are defined in PyInstaller.lib.modulegraph.modulegraph, such as SourceModule, MissingModule, Package and BuiltinModule. These will come in useful when performing checks.

Each of these has an identifier (path.to.my.module), and depending on the node type, it may have a filename (C:/path/to/my/module/__init__.py), and packagepath (['C:/path/to/my/module']).

I can't really post any extra code, as it is quite specific to our setup of using pyarmor with PyInstaller. I can happily say it has worked flawlessly so far, though.

Peter
1

The way to do this is to analyze your imports. To automate that, check out Snakefood. Then you can make a requirements.txt file and get on your way to using virtualenv.

The following will list the dependencies, excluding modules from the standard library:

sfood -fuq package.py | sfood-filter-stdlib | sfood-target-files 

Related questions:

Get a list of python packages used by a Django Project

list python package dependencies without loading them?

Community
nighthawk454
  • Snakefood just generates dependency graphs. How do you propose to exclude the test dependencies and the dependencies-of-dependencies? – wim Feb 14 '17 at 23:01
  • This does not work on Python3 and the author notes in the README that porting it does require quite some work. It still uses the old compile/compile.ast modules which have been removed on Python3. Analysing Python3 packages fails for me. – TheDiveO Aug 28 '19 at 12:32
1

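If you're using an Anaconda (conda) environment, you can run the command below inside it to write every package installed in that environment to a text file. Note that the output uses conda's name=version=build format, so it is meant for conda rather than for pip install -r.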

conda list -e > requirements.txt
oil_lamp
0

I would just run something like this:

import importlib.machinery  # PathFinder lives in this submodule
import os
import pathlib
import re
import sys

from sty import fg  # third-party: pip install sty

sys.setrecursionlimit(100000000)

dependenciesPaths = list()
dependenciesNames = list()
paths = sys.path
red = fg(255, 0, 0)
green = fg(0, 200, 0)
end = fg.rs


def main(path):
    try:
        print("Finding imports in '" + path + "':")

        # Read the whole file and close it promptly
        with open(path) as file:
            contents = file.read()
        wordArray = re.split(" |\n", contents)

        currentList = list()
        nextPaths = list()
        skipWord = -1

        for wordNumb in range(len(wordArray)):
            word = wordArray[wordNumb]

            if wordNumb == skipWord:
                continue

            elif word == "from":
                currentList.append(wordArray[wordNumb + 1])
                skipWord = wordNumb + 2

            elif word == "import":
                currentList.append(wordArray[wordNumb + 1])

        currentList = set(currentList)
        for i in currentList:
            print(i)

        print("Found imports in '" + path + "'")
        print("Finding paths for imports in '" + path + "':")

        currentList2 = currentList.copy()
        currentList = list()

        for i in currentList2:
            if i in dependenciesNames:
                print(i, "already found")

            else:
                dependenciesNames.append(i)

                try:
                    fileInfo = importlib.machinery.PathFinder().find_spec(i)
                    print(fileInfo.origin)

                    dependenciesPaths.append(fileInfo.origin)

                    currentList.append(fileInfo.origin)

                except AttributeError as e:
                    print(e)
                    print(i)
                    print(importlib.machinery.PathFinder().find_spec(i))
                    # print(red, "Odd noneType import called ", i, " in path ", path, end, sep='')


        print("Found paths for imports in '" + path + "'")


        for fileInfo in currentList:
            main(fileInfo)

    except Exception as e:
        print(e)


if __name__ == "__main__":
    # args
    args = sys.argv
    print(args)

    if len(args) == 2:
        p = args[1]

    elif len(args) == 3:
        p = args[1]

        open(args[2], "a").close()
        sys.stdout = open(args[2], "w")

    else:
        print('Usage')
        print('PyDependencies <InputFile>')
        print('PyDependencies <InputFile> <OutputFile>')

        sys.exit(2)

    if not os.path.exists(p):
        print(red, "Path '" + p + "' is not a real path", end, sep='')

    elif os.path.isdir(p):
        print(red, "Path '" + p + "' is a directory, not a file", end, sep='')

    elif "".join(pathlib.Path(p).suffixes) != ".py":
        print(red, "Path '" + p + "' is not a python file", end, sep='')

    else:
        print(green, "Path '" + p + "' is a valid python file", end, sep='')

        main(p)

    deps = set(dependenciesNames)

    print(deps)

    sys.exit()
-1

This answer shows how to list all installed packages, with versions, from within a Python script. It lists everything installed in the user's virtual environment.

from pip._internal.operations import freeze

x = freeze.freeze()
for dependency in x:
   print(dependency)

For this to work, pip must be installed in the environment. Use the following command to install it.

pip install pip

The print output would look like the following.

certifi==2020.12.5
chardet==4.0.0
idna==2.10
numpy==1.20.3
oauthlib==3.1.0
pandas==1.2.4
pip==21.1.2
python-dateutil==2.8.1
pytz==2021.1
requests==2.25.1
requests-oauthlib==1.3.0
setuptools==41.2.0
six==1.16.0
urllib3==1.26.4
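One caveat: pip does not offer a supported Python API, so anything under pip._internal can change between releases. As a hedged alternative, a similar list can be produced with the standard library alone, assuming Python 3.8 or newer (so not the Python 2.7 from the question). The function name frozen_lines is made up for this illustration, and the output is not guaranteed to match pip freeze line for line.

```python
from importlib.metadata import distributions


def frozen_lines():
    """Return 'name==version' lines for every installed distribution."""
    lines = set()
    for dist in distributions():
        name = dist.metadata.get("Name")
        if name:  # skip distributions with unreadable metadata
            lines.add(f"{name}=={dist.version}")
    # Case-insensitive sort, similar to how pip freeze orders its output
    return sorted(lines, key=str.lower)


for line in frozen_lines():
    print(line)
```

Like pip freeze, this lists everything installed in the environment, not only what a given project imports.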
Keet Sugathadasa