
I am trying to generate a requirements.txt file so that someone can replicate my environment. As you may know, the standard way is

pip freeze > requirements.txt

I noticed that this lists all the packages, including the dependencies of the packages I installed, which makes the list unnecessarily huge. I then browsed around and came across pip-chill, which lists only the top-level installed packages in requirements.txt, omitting their dependencies.

Now, from my understanding, when someone replicates the environment with pip install -r requirements.txt, pip will automatically install the dependencies of the listed packages.

If this is true, it should be safe to use pip-chill instead of pip freeze to generate the requirements.txt. My question is: is there any other risk to omitting the dependencies of installed packages with pip-chill that I am missing here?
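For context, what pip-chill-style tools compute can be sketched with just the standard library: the installed distributions that no other installed distribution requires. This is a rough illustration, not pip-chill's actual implementation; in particular the name handling here is deliberately naive (no dash/underscore normalization, no extras/marker parsing), which real tools do properly with the packaging library.

```python
# Sketch of a pip-chill-style listing: installed distributions that are
# not required by any other installed distribution (Python 3.8+).
import re
from importlib import metadata

def top_level_packages():
    dists = list(metadata.distributions())
    names = {d.metadata["Name"].lower() for d in dists if d.metadata["Name"]}
    required = set()
    for d in dists:
        for req in d.requires or []:
            # Keep only the distribution name; drop version specifiers,
            # extras and environment markers (naive, for illustration).
            m = re.match(r"[A-Za-z0-9._-]+", req)
            if m:
                required.add(m.group(0).lower())
    return sorted(names - required)
```

Running this in a virtual env gives roughly the same set of names pip-chill would print, minus version pins.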

  • I don't know pip-chill, but you have to be careful with dependencies of dependencies that don't match your required version. E.g. you use numpy 1.5 and pandas 1.0 (and that version of pandas asks for numpy 1.4). – alan.elkin May 01 '20 at 04:00
  • The point of freezing *all* package versions is so you know you'll be running with known-good versions. Otherwise version-incompatibility bugs can creep in. – deceze May 01 '20 at 04:01
  • @deceze Well, if you installed a package at a specific version, it'll know the good dependency versions for that version too (from that package's own requirements), right? So I assume recording the dependency versions is not necessary? – Darren Christopher May 01 '20 at 04:04
  • Until the dependency releases a new breaking version, which breaks the package until the package releases a new version that is more strict about its dependency. That's not unheard of. – deceze May 01 '20 at 04:07
  • Hmm yeah, I agree that this could happen if the installed package doesn't record its exact dependency versions. So, you reckon I'm better off with the standard `pip freeze > requirements.txt` then? – Darren Christopher May 01 '20 at 04:09
  • If your only reason against `freeze` is that the list is massive… so what, why's that really important? – deceze May 01 '20 at 04:14
  • Not that important, but I'm just thinking of someone who would like to know the direct dependencies of my package. Using `freeze` means they would have to check every import statement and collect the unique imported libraries. – Darren Christopher May 01 '20 at 04:18
  • You can set up and maintain a `setup.py` file with your direct dependencies and a `requirements.txt` with your known-good frozen dependencies. That's basically how every major dependency manager does it: one direct list and one "lock" file. – deceze May 01 '20 at 04:24
  • Look at https://python-poetry.org/ for an excellent pip replacement. – deceze May 01 '20 at 04:28
  • Somewhat similar question here: https://stackoverflow.com/a/61202584/11138259 – sinoroc May 01 '20 at 10:09
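The "one direct list and one lock file" split mentioned in the comments hinges on the lock file pinning every version exactly. A quick check of that property might look like this; `is_fully_pinned` is a hypothetical helper for illustration, not part of any tool, and it deliberately ignores extras, markers and hashes that real requirement lines can carry.

```python
# Check that every non-comment line of a lock-style requirements file
# pins an exact version with "==". Hypothetical helper for illustration.
def is_fully_pinned(lockfile_text):
    for line in lockfile_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if line and "==" not in line:
            return False
    return True
```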

2 Answers


I believe using pip-compile from pip-tools is a good practice when constructing your requirements.txt. This will make sure that builds are predictable and deterministic.

The pip-compile command lets you compile a requirements.txt file from your dependencies, specified in either setup.py or requirements.in

Here are the steps I recommend for constructing your requirements.txt (when using a requirements.in file):

  1. Create a virtual env and install pip-tools in it:
$ source /path/to/venv/bin/activate
(venv)$ python -m pip install pip-tools
  2. Specify your application/project's direct dependencies in your requirements.in file:
# requirements.in
requests
boto3==1.16.51
  3. Use pip-compile to generate the requirements.txt:
$ pip-compile --output-file=- > requirements.txt

Your generated requirements.txt will contain:

#
# This file is autogenerated by pip-compile
# To update, run:
#
#    pip-compile --output-file=-
#
boto3==1.16.51
    # via -r requirements.in
botocore==1.19.51
    # via
    #   boto3
    #   s3transfer
certifi==2020.12.5
    # via requests
chardet==4.0.0
    # via requests
idna==2.10
    # via requests
jmespath==0.10.0
    # via
    #   boto3
    #   botocore
python-dateutil==2.8.1
    # via botocore
requests==2.25.1
    # via -r requirements.in
s3transfer==0.3.3
    # via boto3
six==1.15.0
    # via python-dateutil
urllib3==1.26.2
    # via
    #   botocore
    #   requests

Your application should always work with the dependencies installed from this generated requirements.txt. If you have to update a dependency, you only need to update the requirements.in file and re-run pip-compile. I believe this is a much better approach than pip freeze > requirements.txt, which I see some people do.

The main advantage of this approach is that you keep track of your project's actual direct dependencies in the separate requirements.in file.

I find this very similar to how dependencies are managed in a Node project, with package.json (requirements.in) and package-lock.json (requirements.txt).
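Note that the generated file above still records which pins came directly from requirements.in, through its "# via -r requirements.in" annotations, so the direct list is recoverable even from the lock file alone. Here is a rough parser sketch; the annotation format is a pip-tools implementation detail that can change between versions, and only the single-line "via -r requirements.in" form shown above is handled.

```python
# Recover the direct dependencies from a pip-compile-generated
# requirements.txt by reading its "# via -r requirements.in" annotations.
def direct_dependencies(text):
    direct = []
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("#"):
            current = stripped.split("==")[0]  # name of the last pin seen
        elif "via -r requirements.in" in stripped and current:
            direct.append(current)
    return direct
```

Feeding it the example output above yields just boto3 and requests, i.e. exactly what was in requirements.in.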

alegria

From my point of view, requirements.txt files should list all dependencies: direct dependencies as well as their own (indirect, transitive) dependencies. If for some reason only the direct dependencies are wanted, there are tools that can help with that. From a cursory look, pip-chill seems inadequate, since it doesn't actually look at the code to figure out which packages are directly imported. Better look at projects such as pipreqs or pigar; they seem to be more accurate in figuring out what the actual direct dependencies are (based on the imports in your code).

But at the end of the day you should curate such lists by hand. When writing the code, you choose carefully which packages to import; with the same care, you should curate the list of projects (and their versions) containing those packages. Tools can help, but the developer knows best.
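The import-scanning approach of pipreqs/pigar-style tools can be sketched with the standard library's ast module: walk the source tree and collect top-level imported module names. This is only the easy half; mapping module names to PyPI project names (e.g. cv2 to opencv-python) is a separate, harder step that those tools handle with lookup tables.

```python
# Collect top-level module names imported anywhere under source_dir,
# in the spirit of pipreqs/pigar (illustrative sketch only).
import ast
from pathlib import Path

def imported_modules(source_dir):
    found = set()
    for path in Path(source_dir).rglob("*.py"):
        tree = ast.parse(path.read_text(encoding="utf-8"))
        for node in ast.walk(tree):
            if isinstance(node, ast.Import):
                found.update(alias.name.split(".")[0] for alias in node.names)
            elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
                # Skip relative imports; keep only the top-level package.
                found.add(node.module.split(".")[0])
    return sorted(found)
```

Note it reports stdlib imports too; a real tool would also filter those out before writing a requirements list.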

sinoroc