How do I get snakemake to activate a conda environment that already exists in my environment list?

I know you can use --use-conda with a .yaml environment file, but that seems to generate a new environment, which is annoying when the environment already exists. Any help with this would be much appreciated.

I have tried using:

conda:
    path/to/some/yamlFile

but it just returns "command not found" errors for packages in the environment.

Lamma
  • Are you sure that the command not found error is related to re-using an existing conda environment? Anyways, this is not possible unless you want to get really hacky. – Maarten-vd-Sande Nov 29 '19 at 15:44
  • @Maarten-vd-Sande So you mean I should remove the conda environment if it already exists? – Lamma Dec 02 '19 at 11:49
  • No, snakemake will install it in the .snakemake folder. Just use --use-conda, and if it's possible, don't worry about the double installment of environments. – Maarten-vd-Sande Dec 02 '19 at 12:47
  • So when I use the `--use-conda` flag I get version errors such as: `ERROR: No matching distribution found for antismash==4.2.0 (from -r /faststorage/project/ABR/Each_reads/.snakemake/conda/condaenv.y5ifh0em.requirements.txt (line 1))`. Yet I know version 4.2 is available, and `conda search antismash` confirms this. – Lamma Dec 02 '19 at 13:35

4 Answers

It is possible. It is essentially an environment configuration issue: you need to call bash in the Snakemake rule and load a conda-init'd bash profile there. The example below works for me:

rule test_conda:
    shell:
        """
        bash -c '
            . $HOME/.bashrc # if not loaded automatically
            conda activate base
            conda deactivate'
        """

In addition, --use-conda is not necessary in this case at all.

liagy

This question is still trending on Google, so an update:

Since Snakemake 6.14.0 (2022-01-26), using an existing, named conda environment has been a supported feature.

You simply put the name of the environment (here some-env-name) into the rule's conda directive instead of the .yaml file, and run snakemake --use-conda:

rule NAME:
    input:
        "table.txt"
    output:
        "plots/myplot.pdf"
    conda:
        "some-env-name"
    script:
        "scripts/plot-stuff.R"

Documentation: https://snakemake.readthedocs.io/en/stable/snakefiles/deployment.html#using-already-existing-named-conda-environments

Note: It is recommended to use this feature sparingly and to prefer specifying an environment.yaml file instead, to increase reproducibility.
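
Before relying on a named environment, it can help to check the two preconditions mentioned above: a sufficiently new Snakemake and an environment with that name. The following is only a sketch under some assumptions: the function names are made up for illustration, `sort -V` is assumed to be available (GNU coreutils), and the environment name is whatever you pass in.

```shell
#!/usr/bin/env bash
# Sketch: verify that named conda environments can be used in the conda directive.
# Function names are hypothetical helpers, not part of Snakemake or conda.

# Return 0 if version $1 >= version $2 (relies on `sort -V` version ordering).
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Return 0 if an environment named $1 is listed by `conda env list`.
env_exists() {
    conda env list | awk '{print $1}' | grep -qx -- "$1"
}

# Check both preconditions for the environment name given as $1.
check_named_env_support() {
    local env_name="$1" min_version="6.14.0" current
    current="$(snakemake --version)" || return 1
    if ! version_ge "$current" "$min_version"; then
        echo "Snakemake $current is older than $min_version" >&2
        return 1
    fi
    if ! env_exists "$env_name"; then
        echo "No conda environment named '$env_name' found" >&2
        return 1
    fi
}
```

For example, `check_named_env_support some-env-name && snakemake --use-conda --cores 1` would only start the workflow when both checks pass.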

euronion

Prefer Snakemake-managed environments

This is an old answer, from before Snakemake added a feature to allow user-managed environments. Other answers cover the newer functionality. Nevertheless, I am retaining this answer because I believe it adds perspective on the problem and on why using this feature is still discouraged. Specifically, from the documentation:

"Importantly, one should be aware that this can hamper reproducibility, because the workflow then relies on this environment to be present in exactly the same way on any new system where the workflow is executed. Essentially, you will have to take care of this manually in such a case. Therefore, the approach using environment definition files described above is highly recommended and preferred." [emphasis in the original]


(Mostly) Original Answer

This wasn't previously possible and I'd still argue it was mostly a good thing. Snakemake having sole ownership of the environment helps improve reproducibility by requiring one to update the YAML instead of directly manipulating the environment with conda (install|update|remove). Note that such a practice of updating a YAML and recreating is a Conda best practice when mixing in Pip, and it definitely doesn't hurt to adopt it generally.

Conda does a lot of hardlinking, so I wouldn't sweat the duplication too much - it's mostly superficial. Moreover, if you create a YAML from the existing environment you wish to use (conda env export > env.yaml) and give that to Snakemake, then all the identical packages that you already have downloaded will be used in the environment that Snakemake creates.


If space really is such a tight resource, you can simply not use Snakemake's --use-conda flag and instead activate your named envs as part of the shell command or script you provide. I would be very careful not to manipulate those envs, or at least be very diligent about tracking changes made to them. Consider tracking the output of conda env export > env.yaml under version control and listing that YAML as an input file in the Snakemake rules that activate the environment. This way Snakemake can detect that the environment has changed and that the downstream files are potentially outdated.
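
One way to keep such a tracked YAML in sync is sketched below. It assumes the export file should only be rewritten when its contents actually change, so that its modification time moves only when the environment did; the function names and the paths in the usage line are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: refresh a version-controlled export of a named conda environment.
# Function names are hypothetical helpers, not part of Snakemake or conda.

# Replace tracked file $2 with new file $1 only if their contents differ.
# If they are identical, discard $1 and leave $2 (and its timestamp) untouched.
update_if_changed() {
    local new="$1" tracked="$2"
    if [ -f "$tracked" ] && cmp -s "$new" "$tracked"; then
        rm -f "$new"
        return 0
    fi
    mv "$new" "$tracked"
}

# Export environment $1 to a temporary file, then update tracked YAML $2.
refresh_env_snapshot() {
    local env_name="$1" tracked="$2" tmp
    tmp="$(mktemp)" || return 1
    if ! conda env export -n "$env_name" > "$tmp"; then
        rm -f "$tmp"
        return 1
    fi
    update_if_changed "$tmp" "$tracked"
}
```

Running something like `refresh_env_snapshot dev envs/dev.yaml` before `snakemake`, and listing `envs/dev.yaml` under `input:` in the rules that activate `dev`, would let the usual timestamp check flag outputs produced with an older state of the environment.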

merv
  • Some of the tools I am using require setup after installation from conda. How would I handle this when using `conda: path/to/some/yamlFile`? – Lamma Dec 02 '19 at 09:29
  • @Lamma yeah, that complicates things. The simplest option is to abandon the `--use-conda` flag, as suggested in the answer. Alternatively, you could make a container that has the env pre-created and configured, then use `--use-singularity`. Or, if the post-installation can be automated, one could build a custom Conda package that [runs some post-linking scripts](https://docs.conda.io/projects/conda-build/en/latest/resources/link-scripts.html). Sorry I seem to have missed your comment! – merv Jan 28 '20 at 01:02
  • No worries, thank you for the help :) I will give this a go! – Lamma Jan 28 '20 at 13:13
  • This comment is outdated and false. – crusher083 Dec 12 '22 at 18:07
  • @crusher083 updated; I still stand by my original recommendation and am retaining the answer because I believe it provides useful perspective, as well as some suggestions for how to manually loop back in some reproducibility. – merv Dec 12 '22 at 19:34

This is a follow-up to the answer by liagy. Because Snakemake runs shell commands in bash strict mode (including the set -u flag), conda activate or conda deactivate may fail with an "unbound variable" error coming from the conda environment scripts. I ended up editing the parent conda.sh file, which contains the activate function. The edit temporarily disables the u flag while activating or deactivating conda environments but preserves bash strict mode for the rest of the Snakemake workflow.

Here is what I did:

Edit ~/anaconda3/etc/profile.d/conda.sh (after backing up the original file) and add the following at the very start of the __conda_activate() block:

__conda_activate() {
    if [[ "$-" =~ .*u.* ]]; then
        local bash_set_u
        bash_set_u="on"
        ## temporarily disable u flag
        ## allow unbound variables from conda env
        ## during activate/deactivate commands in
        ## subshell else script will fail with set -u flag
        ## https://github.com/conda/conda/issues/8186#issuecomment-532874667    
        set +u
    else
        local bash_set_u
        bash_set_u="off"
    fi

# ... rest of code from the original script

Also add the following code at the end of the __conda_activate() block, to re-enable set -u only if it was enabled before the conda activate/deactivate call:

    ## reenable set -u if it was enabled prior to
    ## conda activate/deactivate operation
    if [[ "${bash_set_u}" == "on" ]]; then
        set -u
    fi
}

Then, in the Snakefile, you can use shell commands like the following to manage existing conda environments:

    shell:"""
        ## check current set flags
        echo "$-"
        ## switch conda env
        source ~/anaconda3/etc/profile.d/conda.sh && conda activate r-reticulate
        ## Confirm that set flags are same as prior to conda activate command
        echo "$-"

        ## switch conda env again
        conda activate dev
        echo "$-"
        which R
        samtools --version

        ## revert to previous: r-reticulate
        conda deactivate
        """

You do not need to add the above patch to the __conda_deactivate function, as it sources the activate script.

PS: Editing ~/anaconda3/etc/profile.d/conda.sh is not ideal. Always back up both the original and the edited file. Updating conda will most likely overwrite these changes.
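
If you would rather not touch conda.sh at all, the same toggle can probably live in a small wrapper function in your own shell code. This is only a sketch of that idea, not something taken from conda or Snakemake; the name without_nounset is made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: run one command with `set -u` temporarily disabled.
# `without_nounset` is a hypothetical helper name.

without_nounset() {
    local had_u=0 status
    # Remember whether nounset (-u) is currently active, then switch it off.
    case "$-" in
        *u*) had_u=1; set +u ;;
    esac
    # Run the given command in the current shell, so that shell functions
    # such as `conda activate` can still modify the environment.
    "$@"
    status=$?
    # Restore nounset only if it was on before the call.
    if [ "$had_u" -eq 1 ]; then
        set -u
    fi
    return "$status"
}
```

Inside a rule's shell block this could be used as `without_nounset source ~/anaconda3/etc/profile.d/conda.sh` followed by `without_nounset conda activate dev`. For a one-off, the inline form `set +u; conda activate dev; set -u` has the same effect, assuming strict mode was on to begin with.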

Samir