19

I have a git repo with my project. I change my conda environment quite frequently, so I want my repo to track changes in the environment, and to be able to push the most recent one and pull it on another computer. Is this possible? I searched and found several solutions (e.g. https://tdhopper.com/blog/my-python-environment-workflow-with-conda/) but none of them tracks changes automatically.

That is, I want any changes I make to my environment, such as adding new packages, to be included in the project's repository, so that when I git pull on another computer, the new package is also pulled and added to the environment.

Kunal Vohra
Cranjis
  • It might be possible. Add more details to your question and maybe you will get an answer. – Tim Biegeleisen Aug 27 '18 at 11:14
  • @TimBiegeleisen what is missing? – Cranjis Aug 27 '18 at 11:22
  • @okuoub If you have some changes, you commit and push it to github from your machine. Then you use git pull to bring in those changes on your other computer - this is the normal way to use git. But perhaps you are asking about something different - but it is not very clear what you want to accomplish – nos Aug 27 '18 at 11:45
  • @nos I want to include any changes I make in my environment into the project's repository. Like adding new packages, etc. So that when I git pull it in another computer, the new package will be also pulled and added to the environment. – Cranjis Aug 27 '18 at 12:10
  • @okuoub Ok, that's important, so add that to your question. I don't know conda well enough to answer it though. – nos Aug 27 '18 at 13:44

2 Answers

25

I use git hooks to automate conda environment updates. See the Git documentation on hooks for more information.

The idea here is to have two git hooks:

  • One detects whether your local conda environment has changed and, if so, creates a new commit with the updated env.yml file (I chose a pre-push hook for this one).
  • One detects a change in the env.yml file after a pull, i.e. the remote env.yml differed from the local one and was merged (I chose a post-merge hook for this one).

As described in the documentation, when a git repository is initialized, a .git/hooks folder is created and filled with example scripts. To use one of them, edit the file, rename it to remove its .sample extension, and make sure it is executable.
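The installation steps can be sketched as a small shell function. This is only an illustration: install_hook is a hypothetical helper name, not a git command, and the optional third argument (the hooks directory) is an assumption added to make the function easy to try outside a real repository.

```shell
#!/bin/sh
# Hypothetical helper (not part of git): install a script as a git hook.
# Usage: install_hook <script> <hook-name> [hooks-dir]
install_hook() {
    script="$1"
    hook_name="$2"
    hooks_dir="${3:-.git/hooks}"   # default location inside a repository

    # The hook file must carry the exact hook name, with no extension
    cp "$script" "$hooks_dir/$hook_name" || return 1
    # Git does not run hooks that are not executable
    chmod u+x "$hooks_dir/$hook_name"
}
```

For example, `install_hook my-pre-push.sh pre-push` run from the top of the repository would copy a script (the file name here is made up) to .git/hooks/pre-push and mark it executable.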

NOTE: I use zsh as my shell, but the script should work the same in bash (please comment if not); you would just need to change the shebang line.


pre-push hook

  • Rewrite the pre-push.sample file already present in .git/hooks (replace <ENV_NAME> with the name of your conda environment):
#!/usr/bin/env zsh

# printf interprets \n in both zsh and bash (bash's echo does not without -e)
printf '\n==================== pre-push hook ====================\n'

# Export the conda environment to a YAML file
conda env export -n <ENV_NAME> > env.yml

# If the exported file matches the tracked one, there is nothing to commit
if git diff --exit-code --quiet env.yml; then
    echo "Conda environment not changed. No additional commit."
else
    # Otherwise commit the new file and abort this push so it can be repeated
    echo "Conda environment changed. Committing new env.yml"
    git add env.yml
    git commit -m "Updating conda environment"
    echo 'You need to push again to push additional "Updating conda environment" commit.'
    exit 1
fi
  • Remove its extension .sample and make it executable if necessary (chmod u+x pre-push)

post-merge hook

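  • Create a post-merge file in .git/hooks with the following content:
#!/usr/bin/env zsh

printf '\n==================== post-merge hook ====================\n'

# Files that changed between the pre-merge commit and the merged result
changed_files="$(git diff-tree -r --name-only --no-commit-id ORIG_HEAD HEAD)"

# Run the command in $2 only if the file named in $1 is among the changed files
check_run() {
    echo "$changed_files" | grep --quiet --line-regexp --fixed-strings "$1" && eval "$2"
}

check_run env.yml "echo 'Have to update the conda environment' && conda env update --file env.yml"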
  • And make it executable (chmod u+x post-merge)

What will happen now?

  • When pushing, if the conda environment changed, a message will tell you to push again in order to push the commit with the updated env.yml.
  • When pulling, if the pulled env.yml differs from the local env.yml, conda will update the local environment with the newly pulled env.yml.

Limitations

  • If the environment changed locally, the updated env.yml is not pushed to the remote automatically; you have to push a second time. I took the advice from this post: "git commit in pre-push hook".
  • The conda environment is currently updated after a pull by a post-merge hook. I don't know how this behaves with a rebase, for example.
  • I'm no git expert; there may be hooks better suited to these tasks.
  • I noticed a prefix section in env.yml which gives the path to the environment folder on your local machine. After some testing, everything seems to run fine, but I don't know whether this could create conflicts when developing on several machines.
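One possible way around the prefix concern is to drop that line while exporting, so that local paths stay out of the repository and two machines with different install locations do not produce different env.yml files. The sketch below assumes the line has the form `prefix: /some/path` at the start of a line; strip_prefix is a hypothetical helper name.

```shell
# Hypothetical helper: drop the machine-specific "prefix:" line from an
# exported environment file read on standard input.
strip_prefix() {
    grep -v '^prefix: '
}

# Possible use in the pre-push hook (not run here):
#   conda env export -n <ENV_NAME> | strip_prefix > env.yml
```

Whether this is needed depends on your setup; it mainly keeps the pre-push hook from seeing a "change" that is only a different install path.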

So... comments, corrections, and ideas for improvement are more than welcome!

khourhin
  • This approach didn't work for me. I get the following output every time I enter `git push`: ==================== pre-push hook ==================== .git/hooks/pre-push: 12: [[: not found Conda environment changed. Commiting new env.yml On branch master Your branch is ahead of 'origin/master' by 2 commits. (use "git push" to publish your local commits) nothing to commit, working tree clean You need to push again to push additional "Updating conda environment" commit. error: failed to push some refs to 'git@github.com:amorris28/artificial_ecosystem_selection.git' – Andrew Nov 25 '20 at 22:30
  • I think your line `conda env export -n env.yml` needs a `>` before the `env.yml` – Andrew Nov 25 '20 at 22:31
  • I think so too, thanks for the comment. I edited accordingly. – khourhin Nov 26 '20 at 20:21
  • While using this with [`pre-commit`](https://pre-commit.com/#post-merge), the post-merge one takes quite a long time to run. Any idea why? – Ander Biguri Jun 19 '23 at 15:43
  • I think this might be related to the "conda update" in the post-merge part. Maybe using tools such as [mamba](https://mamba.readthedocs.io/en/latest/user_guide/mamba.html) can help speed up the process. – khourhin Jun 29 '23 at 07:22
10

Conda can export an environment to a file and create an environment from such a file, so the file can be included in your git repo. If you pull your repo on a different machine or delete your environment, you can run:

conda env create -f=env.yml

When you make changes to your environment, export it again before you add/commit:

conda env export > env.yml
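On a machine where the environment already exists, an update may be more convenient than a fresh create. The following is a rough sketch under some assumptions: sync_env and the CONDA variable are hypothetical names, the environment name is read from a top-level `name:` line in the file, and existing environments are detected from the first column of `conda env list`.

```shell
# Command used to invoke conda; overridable (an assumption for this sketch)
CONDA="${CONDA:-conda}"

# Hypothetical helper: create the environment described in a file, or
# update it if an environment with that name already exists.
sync_env() {
    file="${1:-env.yml}"
    # Read the environment name from the "name:" line of the file
    name="$(sed -n 's/^name: *//p' "$file" | head -n 1)"
    if [ -z "$name" ]; then
        echo "No name: entry found in $file" >&2
        return 1
    fi
    # The first column of "conda env list" holds the environment names
    if "$CONDA" env list | awk '{print $1}' | grep -qxF "$name"; then
        # --prune also removes packages no longer listed in the file
        "$CONDA" env update --file "$file" --prune
    else
        "$CONDA" env create --file "$file"
    fi
}
```

Calling `sync_env env.yml` after a git pull would then cover both the first setup and later updates.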
MarshHawk