
I have to use Conda and pip together because some packages I need are only available via Conda, whereas others are only available via PyPI.

I'm following this guide carefully to avoid putting my environment in a broken state.

Note the following excerpts:

Running conda after pip has the potential to overwrite and potentially break packages installed via pip. Similarly, pip may upgrade or remove a package which a conda-installed package requires.

Creating conda packages for all additional software needed is a reliably safe method for putting together a data science environment but can be a burden if the environment involves a large number of packages which are only available on PyPI. In these cases, using pip only after all other requirements have been installed via conda is the safest practice.

Only after conda has been used to install as many packages as possible should pip be used to install any remaining software. If modifications are needed to the environment, it is best to create a new environment rather than running conda after pip.

Because of that, I frequently need to remove and recreate my Conda environment.

Here is how I do that:

# Dump the environment to a file
$ conda env export > environment.yml

# Deactivate the environment, so it becomes deletable
$ conda deactivate

# Delete the environment
$ conda env remove -n my-env

# Recreate the environment from the file
$ conda env create -f environment.yml -v

# Activate the new environment
$ conda activate my-env

Is there an easier way to do all of that with one command?

I suppose I could write a shell script, but some of the commands take an arbitrary amount of time to complete, and I don't know how to time everything correctly.

Something like conda env recreate would be ideal.

leifericf
  • You could add all the packages to an environment.yaml file manually, along with pip packages: https://stackoverflow.com/questions/35245401/combining-conda-environment-yml-with-pip-requirements-txt – Tzane Jun 09 '22 at 07:36
  • My apologies if my question was unclear. I already have a single file (`environment.yml`) containing both conda and pip requirements, so that part is OK. I'm trying to figure out how to delete and recreate my environment with a single command instead of manually executing five commands. – leifericf Jun 09 '22 at 07:39
  • Why wouldn't a bash script wait for each command to finish before running the next? – Axiomel Jun 09 '22 at 07:47
  • Does it do that automatically? I'm not very familiar with bash scripting, so I presumed (perhaps incorrectly) that it wouldn't handle timing and exceptions automatically. – leifericf Jun 09 '22 at 07:49
  • you can just chain all those commands with `&&` – Josh Friedlander Jun 09 '22 at 07:52
  • Thus I make a fool of myself and learn something new yet again. Thanks for the advice! I think I know what to do. – leifericf Jun 09 '22 at 07:54
  • How often are you doing this? Can you give an example of changes made to the environment? Can the PyPI packages not be added to Conda Forge? Honestly, if the Pip-installed packages are pure Python (no compiled components), this whole rigamarole of recreation can be safely ignored. – merv Jun 09 '22 at 17:44
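
For readers unfamiliar with the single-file setup mentioned in the comments, here is a rough sketch of an `environment.yml` that lists both conda and pip requirements. The file name `environment.example.yml`, the channel, and every package name are placeholders chosen for illustration; none of them come from the question. As described in the comments further down, recreating from such a file installs the conda dependencies first and then runs pip for the entries under `pip:`.

```shell
# Write an illustrative environment file. All package names and versions
# below are placeholders (assumptions), not the asker's real requirements.
cat > environment.example.yml <<'EOF'
name: my-env
channels:
  - conda-forge
dependencies:
  - python=3.10
  - numpy
  - pip
  - pip:
      - some-pypi-only-package
EOF
```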

2 Answers

You can achieve this by cloning your environment first and then chaining the removal onto the same command line:

conda create --name new_env_name --clone old_env_name && conda remove --name old_env_name --all

Here, && means that the second command (which removes the old environment) only runs if the first one exits with a successful return code.
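
To see that behaviour without touching any real environment, here is a minimal sketch. The `step` function is a hypothetical stand-in for a long-running conda command; it only prints its name and returns the exit status it is given.

```shell
# `step` is a hypothetical placeholder for a slow command such as a conda call.
step() {
    # Print the step name, then return the exit status passed as the second argument.
    echo "running: $1"
    return "$2"
}

# The first step succeeds (status 0), so the second step runs as well.
step "clone" 0 && step "remove" 0

# The first step fails (status 1), so the second step is skipped
# and the fallback message is printed instead.
step "clone" 1 && step "remove" 0 || echo "chain stopped early"
```

The shell waits for each command to finish before evaluating the next part of the chain, so no manual timing is needed.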

If you absolutely need to keep the same environment name, do it like this:

conda create --name env_name2 --clone env_name && conda remove --name env_name --all && conda create --name env_name --clone env_name2 && conda remove --name env_name2 --all

So, basically, you clone the environment under a temporary name (env_name2), delete the original, clone the temporary copy back under the original name, and then delete the temporary copy. Done!
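
If the one-liner is used often, it could be wrapped in a shell function so the environment name only has to be typed once. This is a sketch only: `conda_env_reclone` and the `-tmp` suffix are made-up names, not part of conda, and the function assumes no environment with the temporary name already exists.

```shell
# Hypothetical wrapper around the double-clone one-liner; not a conda command.
conda_env_reclone() {
    env_name="$1"
    if [ -z "$env_name" ]; then
        echo "usage: conda_env_reclone ENV_NAME" >&2
        return 2
    fi
    # Assumed temporary name; pick another if it clashes with a real environment.
    tmp_name="${env_name}-tmp"

    # Each step runs only if the previous one succeeded.
    conda create --name "$tmp_name" --clone "$env_name" &&
        conda remove --name "$env_name" --all &&
        conda create --name "$env_name" --clone "$tmp_name" &&
        conda remove --name "$tmp_name" --all
}
```

Usage would then be `conda_env_reclone my-env`. Because every step is joined with &&, a failed clone leaves the original environment untouched.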

nferreira78
  • Hey, that's neat! I was not aware of the `--clone` option. When I try to run that command, I get an error `EnvironmentLocationNotFound: Not a conda environment: ~/opt/miniconda3/envs/my-env`, presumably because it's trying to create and delete an environment with the same name in the wrong order (I need to keep the name). – leifericf Jun 09 '22 at 08:20
  • Yeah, to clone you have to use different environment names, otherwise it won't work. Also, `EnvironmentLocationNotFound` is flagging that perhaps the name you are passing to clone does not exist? If you delete before cloning, it won't work – nferreira78 Jun 09 '22 at 08:55
  • Please check my edited answer for a workaround that keeps the same environment name, using double cloning – nferreira78 Jun 09 '22 at 09:00
  • Are you sure cloning solves the issue of Conda being in a desynchronized state because of Pip mixing? Cloning will hardlink whatever Conda packages it can and then *copy everything else* (pip-installed or otherwise). I don't think that is the same as recreating from a YAML (which reruns all `conda install` commands; followed by `pip install` commands). – merv Jun 10 '22 at 16:03
  • @merv please point to the documentation where the "hardlink" conda packages takes priority... As far as I have experienced (a lot), all pip packages installed in an environment will still be exported as pip packages with corresponding versions. Conda packages get installed first and then pip packages. But that's a completely different story from exporting. Cloning is called cloning for a reason... – nferreira78 Jun 12 '22 at 13:48
  • I'm looking at [the code for cloning](https://github.com/conda/conda/blob/0ce67ff01354667d6cf8c956839a401f42e41bde/conda/misc.py#L187). As well as doing a simple test, with verbosity flags. Conda will reinstall the Conda packages, but then just copy everything else (including PyPI packages). That is not the same as the blog recommendation of recreating from YAML, which first installs the Conda packages and then runs `pip install` for the PyPI packages. (The "hardlinking" part is maybe distracting - that's just how Conda installs its packages by default, unless one switches that off.) – merv Jun 12 '22 at 17:39
  • Well, I'm looking at the conda documentation [Cloning an environment](https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html#cloning-an-environment), which shows "You can make an exact copy of an environment by creating a clone of it" – nferreira78 Jun 13 '22 at 10:03

I solved this by writing a shell script conda_env_recreate.sh:

#!/usr/bin/env zsh

env_file='environment.yml'
env_name='my-env'

echo 'Dumping Conda environment to file.'
conda env export --name "$env_name" > "new_$env_file"

echo 'Deactivating Conda environment.'
conda deactivate

echo 'Deleting Conda environment.'
conda env remove -n "$env_name"

echo 'Recreating Conda environment from file.'
conda env create -f "$env_file" -v

echo 'Reactivating Conda environment.'
conda activate "$env_name"

# This next step requires Kaleidoscope: https://kaleidoscope.app
if ! cmp -s "$env_file" "new_$env_file"
then
    echo 'Comparing old and new Conda environment file.'
    ksdiff "$env_file" "new_$env_file"
fi

And then I run it like this:

source conda_env_recreate.sh
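
As a variation on the script above, the same steps could live in a shell function (for example in a shell startup file), which avoids a separate file to `source`. This is a sketch under assumptions: `conda_env_recreate` is a made-up name, not a conda subcommand, and the environment file is assumed to carry the matching `name:` entry. The steps are joined with &&, as suggested in the comments on the question, so a failure stops the chain.

```shell
# Hypothetical helper; the name and arguments are illustrative, not part of conda.
conda_env_recreate() {
    env_name="$1"
    env_file="${2:-environment.yml}"   # defaults to environment.yml in the current directory

    if [ -z "$env_name" ]; then
        echo "usage: conda_env_recreate ENV_NAME [ENV_FILE]" >&2
        return 2
    fi
    # Refuse to delete anything if there is no file to recreate from.
    if [ ! -f "$env_file" ]; then
        echo "environment file not found: $env_file" >&2
        return 1
    fi

    # Each step runs only if the previous one succeeded.
    conda deactivate &&
        conda env remove -n "$env_name" &&
        conda env create -f "$env_file" &&
        conda activate "$env_name"
}
```

It would be called as `conda_env_recreate my-env`. Checking for the file first means the old environment is not removed when there is nothing to rebuild it from.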
leifericf
  • You don't need to activate the environment only to export it and then deactivate it again; you can simply do it in one line like this: `conda env export --name $env_name > $temp_filename` – nferreira78 Jun 10 '22 at 11:08
  • On another note, once you have your packages settled, why don't you simply keep an environment_file.yml in your code package? That way you can just delete the old environment and create the new one from the same yml file on your system – nferreira78 Jun 10 '22 at 11:12
  • @nferreira78 Thanks for the advice! I'm using many branches in Git. And I install different packages in those other branches. When switching between them, I need to recreate my environment from the file in the active branch. – leifericf Jun 10 '22 at 11:40
  • Also, the environment must be deactivated before I can delete it, or else I get an error message: `CondaEnvironmentError: cannot remove current environment. deactivate and run conda remove again` – leifericf Jun 10 '22 at 11:43
  • Updated the script in my answer after feedback to better suit my workflow. This way, `new_environment.yml` will contain my Conda environment how it looked _before_ recreation. And the environment will be recreated based on the old `environment.yml` in source control. Then I can compare these environment files if necessary and keep the correct one. – leifericf Jun 10 '22 at 12:06
  • After all the feedback I have provided, an upvote on my answer and comments would be really appreciated. My answer was intended to provide a one-liner for the command line (double cloning), addressing your question "Is there an easier way to do all of that with one command?". Running a bash script is obviously also a clean method, but it's more of a wrapper than a one-liner. – nferreira78 Jun 10 '22 at 14:12