(May 2023 UPDATE) This answer was derived from the one above by @user5994461
You can use pip for package management. pip is the official package manager for Python and has been bundled with the python.org installers since Python 3.4.
pip is not a virtual environment manager.
pip
basics
pip is the default package manager for Python
pip is bundled with Python as of Python 3.4
Usage: python3 -m venv myenv; source myenv/bin/activate; python3 -m pip install requests
Packages are downloaded from pypi.org, the official public python repository
It can install precompiled binaries (wheels) when available, or source (tar/zip archive).
Compiled binaries are important because many packages are mixed Python/C/other with third-party dependencies and complex build chains. They MUST be distributed as binaries to be ready-to-use.
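The python3 -m pip form in the usage line above ties pip to one specific interpreter, which matters once a virtual environment is active. Here is a minimal sketch of the same idea from inside Python, using only the standard library. The helper names in_virtualenv and pip_install_command are hypothetical, and the command is only built here, not executed:

```python
import sys
from typing import List


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a venv."""
    # Inside a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the base installation.
    return sys.prefix != sys.base_prefix


def pip_install_command(package: str) -> List[str]:
    """Build the argument list for `<this interpreter> -m pip install <package>`."""
    # Using sys.executable guarantees pip runs under the same interpreter
    # (and therefore installs into the same environment) as this script.
    return [sys.executable, "-m", "pip", "install", package]


if __name__ == "__main__":
    print("inside a virtual environment:", in_virtualenv())
    print("command:", " ".join(pip_install_command("requests")))
```

Passing the resulting list to subprocess.run would perform the actual install. It is left out here so the sketch has no side effects.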
advanced
pip can actually install from any archive, wheel, or git/svn repo...
...that can be located on disk, or on an HTTP URL, or a personal PyPI server.
Example: pip install git+https://github.com/psf/requests.git@v2.25.0 (a branch name works the same way as a tag, which can be useful for testing patches on a branch).
Example: pip install https://download.pytorch.org/whl/cpu/torch-1.9.0%2Bcpu-cp39-cp39-linux_x86_64.whl (that wheel is for Python 3.9 on Linux).
When installing from source, pip will automatically build the package (that's not always possible: try building TensorFlow without the Google build system :D).
binary wheels can be python-version specific and OS specific, see manylinux specification to maximize portability.
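The Python version and OS a wheel targets are encoded in its filename, which follows the pattern name-version-pythontag-abitag-platformtag.whl (with an optional build tag after the version). Here is a rough sketch that splits the torch wheel name from the URL above into those fields. WheelTags and parse_wheel_filename are hypothetical names for illustration, and real tools should use a maintained parser rather than this simplified one:

```python
from typing import NamedTuple
from urllib.parse import unquote


class WheelTags(NamedTuple):
    name: str
    version: str
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_filename(filename: str) -> WheelTags:
    """Split a wheel filename into its dash-separated components."""
    decoded = unquote(filename)  # "%2B" in a URL stands for "+"
    if not decoded.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename!r}")
    parts = decoded[: -len(".whl")].split("-")
    # An optional build tag may sit between the version and the Python tag,
    # so the last three fields are always python, abi, platform.
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel filename layout: {filename!r}")
    python_tag, abi_tag, platform_tag = parts[-3:]
    return WheelTags(parts[0], parts[1], python_tag, abi_tag, platform_tag)


if __name__ == "__main__":
    print(parse_wheel_filename("torch-1.9.0%2Bcpu-cp39-cp39-linux_x86_64.whl"))
```

A pure-Python wheel typically ends in none-any, meaning no ABI requirement and any platform, which is why such packages install everywhere.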
conda
conda is an open source environment manager AND package manager maintained by the open source community. It is separate from Anaconda, Inc. and does not require a commercial license to use in any business environment.
conda is also bundled into Anaconda Navigator, a popular commercial Python distribution from Anaconda, Inc. that includes most common data science and Python developer libraries ready-to-use.
You will use conda when you use the Anaconda Navigator GUI.
Packages may be downloaded from conda-forge, the Anaconda repo, and other public and private conda package "channels" (aka repos).
It only installs pre-compiled packages.
conda has its own package format. It doesn't use wheels.
conda install will install a package.
conda build will build a package.
conda can build the Python interpreter (and other C packages it depends on). That's how an interpreter is built and bundled for Anaconda Navigator.
conda allows users to install and upgrade the Python interpreter (pip does not).
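Because conda manages the interpreter itself, it can be handy to check from a script whether the running Python lives in a conda environment. Here is a small sketch, assuming the usual behaviour that conda activate exports CONDA_DEFAULT_ENV and that a conda-managed prefix contains a conda-meta directory. The function names are hypothetical and the directory check is a heuristic, not an official API:

```python
import os
import sys
from typing import Mapping, Optional


def conda_env_name(environ: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the active conda environment name, or None if none is active."""
    # `conda activate` exports CONDA_DEFAULT_ENV (name) and CONDA_PREFIX (path).
    return environ.get("CONDA_DEFAULT_ENV")


def is_conda_python(prefix: str = sys.prefix) -> bool:
    """Heuristic: conda-managed prefixes contain a `conda-meta` directory."""
    return os.path.isdir(os.path.join(prefix, "conda-meta"))


if __name__ == "__main__":
    print("active conda env:", conda_env_name())
    print("interpreter managed by conda:", is_conda_python())
```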
advanced
Historically, one selling point of conda was to support building and installing binary packages, because pip did not support binary packages very well (until wheels and manylinux2010 spec).
Emphasis on building packages. conda has extensive build settings and it stores extensive metadata, to work with dependencies and build chains.
Some projects use conda to initiate complex build systems and generate a wheel, that is published to pypi.org for pip.
conda emphasizes building and managing virtual environments. conda is by design a programming language-agnostic virtual environment manager. conda can install and manage other package managers such as npm, pip, and other language package managers.
Can I use Anaconda Navigator packages for commercial use?
The new language states that use by individual hobbyists, students, universities, non-profit organizations, or businesses with less than 200 employees is allowed, and all other usage is considered commercial and thus requires a business relationship with Anaconda. (as of Oct 28, 2020)
If you are a large developer organization, i.e., greater than 200 employees, you are NOT permitted to use Anaconda or packages from the Anaconda repository for commercial use, unless you acquire a license.
Pulling and using (properly open-sourced) packages from the conda-forge repository does not require a commercial license from Anaconda, Inc. Developers are free to build their own conda packages using the packaging tools provided in the conda-forge infrastructure.
easy_install/egg
- For historical reference only. DO NOT USE
- egg is an abandoned format of package, it was used up to mid 2010s and completely replaced by wheels.
- an egg is a zip archive, it contains python source files and/or compiled libraries.
- eggs were used with easy_install and the first releases of pip.
- easy_install was yet another package manager, which preceded pip and conda. It was officially deprecated in setuptools v58.3 (2021).
- it too caused a lot of confusion, just like pip vs conda :D
- egg files are slow to load, poorly specified, and OS specific.
- Each egg was set up in a separate directory, so an import mypackage would have to look for mypackage.py in potentially hundreds of directories (how many libraries were installed?). That was slow and not friendly to the filesystem cache.
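To make the lookup cost concrete, here is a deliberately simplified model of that search: one directory per fake egg, probed in order until mypackage.py turns up. This is only a sketch of the idea, not the actual import machinery, and find_module and demo are hypothetical helper names:

```python
import os
import tempfile
from typing import List, Optional, Tuple


def find_module(name: str, search_path: List[str]) -> Tuple[Optional[str], int]:
    """Look for `<name>.py` in each directory in order.

    Returns the directory that contains the module (or None) and the
    number of directories that had to be probed.
    """
    probes = 0
    for directory in search_path:
        probes += 1
        if os.path.isfile(os.path.join(directory, name + ".py")):
            return directory, probes
    return None, probes


def demo(num_eggs: int = 200) -> int:
    """Create one directory per fake egg and put the module in the last one."""
    with tempfile.TemporaryDirectory() as root:
        dirs = []
        for i in range(num_eggs):
            d = os.path.join(root, f"package{i}.egg")
            os.mkdir(d)
            dirs.append(d)
        with open(os.path.join(dirs[-1], "mypackage.py"), "w") as f:
            f.write("VALUE = 1\n")
        found, probes = find_module("mypackage", dirs)
        assert found == dirs[-1]
        return probes


if __name__ == "__main__":
    print("directories probed:", demo())
```

In the worst case every import touches every directory, so the cost grows with the number of installed eggs. Wheels avoid this by unpacking into a shared site-packages directory.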
Fun fact: The only strictly-required dependency to build the Python interpreter is zlib (a compression library), because compression is necessary to load more packages. Egg and wheel packages are zip files.
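Python can import code straight out of a compressed zip archive placed on sys.path, which is the mechanism zipped eggs relied on. Here is a minimal sketch using only the standard library. The function name import_from_zip and the archive name bundle.zip are made up for the example:

```python
import importlib
import os
import sys
import tempfile
import zipfile


def import_from_zip(module_name: str, source: str):
    """Write `source` into a compressed zip archive and import it from there."""
    with tempfile.TemporaryDirectory() as tmpdir:
        archive = os.path.join(tmpdir, "bundle.zip")
        # ZIP_DEFLATED is the zlib-backed compression method.
        with zipfile.ZipFile(archive, "w", compression=zipfile.ZIP_DEFLATED) as zf:
            zf.writestr(module_name + ".py", source)
        sys.path.insert(0, archive)  # zip archives are valid sys.path entries
        try:
            importlib.invalidate_caches()
            return importlib.import_module(module_name)
        finally:
            sys.path.remove(archive)


if __name__ == "__main__":
    mod = import_from_zip("zipped_demo_module", "ANSWER = 42\n")
    print(mod.ANSWER, "loaded from", mod.__file__)
```

Wheels are zip files too, but pip unpacks them at install time instead of importing from the archive.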
Why so many options?
A good question.
Let's delve into the history of Python and computers. =D
Pure python packages have always worked fine with any of these packagers. The troubles were with not-only-Python packages.
Most of the code in the world depends on C. That is true for the Python interpreter, which is written in C. That is true for numerous Python packages, which are Python wrappers around C libraries or projects mixing Python/C/C++ code.
Anything that involves SSL, compression, GUI (X11 and Windows subsystems), math libraries, GPU, CUDA, etc... is typically coupled with some C code.
This makes Python libraries hard to package and distribute, because they are not just Python code that can run anywhere. The library must be compiled; compilation requires compilers, system libraries, and third-party libraries; and once compiled, the generated binary code only works on the specific system and Python version it was compiled for.
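Here is a rough sketch of what "only works on the specific system and Python version" means in practice: a binary is usable only if both its interpreter tag and its platform tag match the running Python. This is a heavily simplified model (it ignores ABI tags, manylinux, and version-specific py tags), and interpreter_tag and binary_matches are hypothetical helpers. Real installers use much more complete matching logic:

```python
import sys
import sysconfig
from typing import Tuple


def interpreter_tag(impl_name: str, version: Tuple[int, int]) -> str:
    """Build a tag like 'cp39' for CPython 3.9 (simplified)."""
    prefix = {"cpython": "cp", "pypy": "pp"}.get(impl_name, impl_name)
    return f"{prefix}{version[0]}{version[1]}"


def binary_matches(wheel_python_tag: str, wheel_platform_tag: str,
                   impl_name: str, version: Tuple[int, int],
                   platform: str) -> bool:
    """Simplified check of a package's tags against one interpreter."""
    # Accept the exact interpreter tag (e.g. cp39) or the generic
    # major-version tag (e.g. py3) used by pure-Python packages.
    accepted = {interpreter_tag(impl_name, version), f"py{version[0]}"}
    python_ok = any(tag in accepted for tag in wheel_python_tag.split("."))
    platform_ok = wheel_platform_tag in ("any", platform)
    return python_ok and platform_ok


if __name__ == "__main__":
    version = (sys.version_info[0], sys.version_info[1])
    # sysconfig reports e.g. "linux-x86_64"; tags use underscores instead.
    platform = sysconfig.get_platform().replace("-", "_").replace(".", "_")
    print("this interpreter:", interpreter_tag(sys.implementation.name, version), platform)
    print("can use a cp39 linux_x86_64 binary:",
          binary_matches("cp39", "linux_x86_64",
                         sys.implementation.name, version, platform))
```

When no published binary matches, the installer has to fall back to building from source, which is where missing compilers and libraries start to hurt.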
Originally, Python could distribute pure-Python libraries just fine, but there was little support for distributing binary libraries. In and around 2010 you'd get a lot of errors trying to use numpy or cassandra. The installer downloaded the source and failed to compile, because of missing dependencies. Or it downloaded a prebuilt package (maybe an egg at the time) and it crashed with a SEGFAULT when used, because it was built for another system. It was a nightmare.
This was resolved by pip and wheels from 2012 onward. It then took many years for people to adopt the tools and for the tools to propagate to stable Linux distributions (many developers rely on /usr/bin/python). The issues with binary packages extended into the late 2010s.
For reference, that's why the first command to run on antiquated systems is python3 -m venv myvenv && source myvenv/bin/activate && pip install --upgrade pip setuptools: the OS comes with an old python+pip from 5 years ago that's buggy and can't recognize the current package format.
Continuum Analytics (later renamed Anaconda, Inc.) worked on their own solution (released as Anaconda Navigator) in parallel. Anaconda Navigator was specifically meant to make data science libraries easy to use out-of-the-box (data science = C and C++ everywhere), hence they came up with a package manager specifically meant to address building and distributing binary packages, and built the environment manager -- conda -- into their distribution. It was later open-sourced and embraced by professional software engineers who frequently work with mixed-language environments.
If you install any package with pip install xxx nowadays, it USUALLY just works, until you try to use a package without binaries available for your platform. pip is built into current versions of Python, and it is a recommended way to install Python-only packages.
For mixed-language environments, conda is the way to go, because it will install pre-existing binary packages, regardless of their coding language, and use package managers from other languages (npm, bower) to install their respective language packages.