
I am working on a Python project that requires a few libraries. The project will later be shared with other people.

The problem I have is that I can't simply run the usual pip install 'library' as part of my setup, because the project may be shared with offline computers, and the work proxy could block the download.

So my first thought was to ship .whl files and run pip install 'my_file.whl', but this is limited: some .whl files work on some computers but not on others, so this can't be the solution to my problem.

I tried sharing my project with someone else and ran into exactly this: a .whl file worked on one computer but not on the other.

What I am looking for is to have all the libraries I need already downloaded before sharing my project, so that when the project is shared, my colleagues can launch it without needing to download the libraries.

Is this possible, or is there something else that can solve my problem?

clemdcz
  • Did you consider using a Pipfile (see https://stackoverflow.com/a/49867443/20599896)? – Cpt.Hook Dec 12 '22 at 14:57
  • You could use a requirements.txt, which is a great workaround and will automatically install the packages that others might need. – Blue Robin Dec 12 '22 at 14:58
  • @BlueRobin Will the requirements.txt require other people to be connected to the internet in order to download the libraries? – clemdcz Dec 12 '22 at 15:10
  • @clemdcz I believe so. – Blue Robin Dec 12 '22 at 15:16
  • Ok, so it can't solve my problem. Will look into Pipfile. – clemdcz Dec 12 '22 at 15:19
  • @Cpt.Hook Once I create a virtual environment and install all the libraries in the venv, will it be possible for my colleagues to start the project with the libraries already installed (while being offline)? – clemdcz Dec 12 '22 at 16:20
  • Depends on how you share the code. If you share it via a git repo for example, then after pulling changes with new dependencies a `pipenv install` is needed. A network connection is only needed during this step (and it was needed for the pull anyway). – Cpt.Hook Dec 12 '22 at 16:46
  • And if I share it on a USB key, what needs to be done? – clemdcz Dec 12 '22 at 16:48
  • 1
    If you want to have the environment readily set up for your colleagues, things become very tricky, as you'd need to share the linked libraries between computers (and potentially support different OS, ...). In this case you should share virtual machines, docker environments or something like this... If (and only if) all computers are the same (including system libraries, install paths, etc.) in theory you could just copy the `venv` folder. Hower, this is by far not recommentded! – Cpt.Hook Dec 12 '22 at 16:50
  • 1
    Maybe, you can think of setting up a local [pypi-mirror](https://pypi.org/project/python-pypi-mirror/) ([commercial solution](https://www.jfrog.com/confluence/display/JFROG/PyPI+Repositories)) which is fed with trusted packages via an air-gap mechanism and then allows people in a local non-internet-attached network to pull from there? – Cpt.Hook Dec 12 '22 at 16:57

1 Answer


There are different approaches to the issue here, depending on what the constraints are:


1. Defined Online Dependencies

It is a good practice to define the dependencies of your project (not only when shared). Python offers different methods for this.

In this scenario every developer has access to a PyPI repository via the network, usually the official mirrors (i.e. via the internet). New packages need to be pulled from there whenever the dependencies change, but repository (internet) access is only needed when pulling new packages.

The most common approaches are listed below:

1.1 requirements.txt

A requirements.txt file is a plain-text list of required packages and versions, e.g.:

# requirements.txt
matplotlib==3.6.2
numpy==1.23.5
scipy==1.9.3

When you check this in along with your source code, users can freely decide how to install it. The simplest (though least clean) way is to install it into the base Python environment via

pip install -r requirements.txt

If you have lost track of your dependencies, you can even generate such a file automatically with pipreqs. The result is usually very good; however, a manual cleanup afterwards is recommended.
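
As a rough sketch of that workflow (assuming pipreqs can be installed from a reachable index and the project lives in /path/to/project, which is a placeholder):

pip install pipreqs                # one-time install of the generator
pipreqs /path/to/project           # scan the imports and write requirements.txt
pip install -r requirements.txt    # reproduce the environment elsewhere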

Benefits:

  • Package dependency is clear
  • Installation is a one line task

Downsides:

  • Possible conflicts with multiple projects
  • No guarantee that everyone ends up with exactly the same versions if flexible version specifiers are allowed (the default)

1.2 Pipenv

There is a nice and almost complete answer on Pipenv, and the Pipenv documentation itself is very good. In a nutshell: Pipenv gives you virtual environments, so version conflicts between different projects are gone for good. In addition, the Pipfile used to define such an environment allows separation of production and development dependencies, as sketched below.
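
A minimal Pipfile sketch (the packages and versions below are only placeholders):

# Pipfile
[[source]]
url = "https://pypi.org/simple"
verify_ssl = true
name = "pypi"

[packages]
matplotlib = "==3.6.2"
numpy = "==1.23.5"

[dev-packages]
pytest = "*"

[requires]
python_version = "3.10"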

Users now only need to run the following commands in the folder with the source code:

pip install pipenv # only needed first time
pipenv install

And then, to activate the virtual environment:

pipenv shell

Benefits:

  • Separation between projects
  • Separation of development/testing and production packages
  • Everyone uses the exact same version of the packages
  • Configuration is flexible but easy

Downsides:

  • Users need to activate the environment

1.3 conda environment

If you are using Anaconda, a conda environment definition can also be shared as a configuration file. See this SO answer for details.

This scenario is the same as the Pipenv one, but with Anaconda as the package manager. It is recommended not to mix pip and conda.
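
A minimal sketch of such a configuration file and its usage (environment name and packages are placeholders):

# environment.yml
name: myproject
dependencies:
  - python=3.10
  - numpy=1.23.5
  - scipy=1.9.3

conda env create -f environment.yml   # create the environment from the file
conda activate myproject              # activate it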

1.4 setup.py

If you are implementing a library anyway, you should have a look at how to configure the dependencies via the setup.py file.
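
A minimal sketch of the relevant part, assuming a setuptools-based project (name, version and dependencies are placeholders):

# setup.py
from setuptools import setup, find_packages

setup(
    name="myproject",
    version="0.1.0",
    packages=find_packages(),
    # pulled in automatically when users run `pip install .`
    install_requires=[
        "numpy>=1.23",
        "scipy>=1.9",
    ],
)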


2. Defined Local Dependencies

In this scenario the developers do not have access to the internet (e.g. they are "air-gapped" in a special network where they cannot communicate with the outside world). In this case all the scenarios from 1. can still be used, but now you need to set up your own mirror/proxy. There are good guides (and even complete off-the-shelf software) out there, depending on which of the scenarios above you want to use. Examples are a local pypi-mirror or a commercial repository manager such as JFrog Artifactory (see the links in the comments above).
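
Once such a mirror is running, the workflows from section 1 only need to be pointed at it. A minimal sketch, assuming a hypothetical internal index at http://pypi.intranet.local/simple:

# install from the local mirror instead of pypi.org
pip install -r requirements.txt --index-url http://pypi.intranet.local/simple

Alternatively, the index URL can be configured once in pip's configuration file (pip.conf / pip.ini) so users don't have to pass the flag every time.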

Benefits:

  • Users don't need internet access
  • Packages on the local proxy can be trusted (cannot be corrupted / deleted anymore)
  • The clean and flexible scenarios from above can be used for setup

Downsides:

  • Network connection to the proxy is still required
  • Maintenance of the proxy is extra effort

3. Turnkey Environments

Last, but not least, there are solutions to share the complete and installed environment between users/computers.

3.1 Copy virtual-env folders

If (and only if) all users (are forced to) use an identical setup (OS, install paths, user paths, libraries, locales, ...) then one can copy the virtual environments for Pipenv (1.2) or conda (1.3) between PCs.

These "pre-compiled" environments are very fragile, as a sall change can cause the setup to malfunction. So this is really not recommended.

Benefits:

  • Can be shared between users without network (e.g. USB stick)

Downsides:

  • Very fragile

3.2 Virtualisation

The cleanest way to support this is some kind of virtualisation technique (virtual machine, Docker container, etc.). Install Python and the needed dependencies inside it and share the complete container or image.
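
A minimal Docker sketch, assuming the project ships a requirements.txt and an entry point called main.py (both names are placeholders):

# Dockerfile
FROM python:3.10-slim

WORKDIR /app

# install the dependencies first so this layer is cached
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# copy the project sources
COPY . .

CMD ["python", "main.py"]

The built image can then be moved without any network, e.g. exported with docker save to a USB stick and imported on the target machine with docker load.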

Benefits:

  • Users can just use the provided container

Downsides:

  • Complex setup
  • Complex maintenance
  • Virtualisation layer needed
  • Code and environment may become convoluted

Note: This answer is compiled from a summary of the (mostly my own) comments above.

Cpt.Hook