I'm writing a pipeline in Nextflow and want to use several existing conda environments, both to avoid inconsistencies in tool installation and to share specific modules of the pipeline. The Nextflow docs state that the best practice is to specify the conda environment in nextflow.config (see here). However, the declaration shown is just process.conda, which seems to apply to all processes rather than being process-specific.

I know I can just specify an existing conda environment in each process, but I'm trying to adhere to the best practices for portability.

As I haven't been able to find any documentation online for this specific issue, I have tried the following declarations in the config file:

profiles {
    conda {
        process.conda = "something" // works but single env for all processes
        fastqc.conda = "something" // where fastqc is the name of the process - FAILS
        process.fastqc.conda = "something" // FAILS
    }
}

I have also tried:

profiles {
    conda {
        process {
            withName: fastqc {
                 process.conda = "something"
            }
        }
    }
}

which also fails with the error: unknown config attribute withName

Interestingly,

process {
    conda {
        withName: fastqc {
            process.conda = "something"
        }
    }
}

does allow me to run different conda environments for each process but cannot be turned on and off by the -profile option (because specifying a profile block breaks it).

Ryan

2 Answers

I'm not sure there's a "best practice" exactly, but I think the usual way is to create a separate Conda configuration file and use the withName or withLabel process selectors to set the conda directive per process. For example, the contents of conf/conda.config might look like:

process {

    withLabel: 'fastqc' {
        conda = 'fastqc=0.11.8=1'
    }

    withName: 'cutadapt' {
        conda = 'cutadapt=2.10=py37h516909a_0'
    }
}
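
For these selectors to take effect, the process definitions need a matching name or label. Here is a minimal sketch of what that might look like in main.nf, assuming DSL2. The process name run_fastqc, the params.reads parameter, and the adapter sequence are hypothetical and only there for illustration:

```groovy
// main.nf (illustrative sketch; names and parameters are assumptions)

process run_fastqc {
    label 'fastqc'            // picked up by withLabel: 'fastqc'

    input:
    path reads

    output:
    path "*_fastqc.{zip,html}"

    script:
    """
    fastqc ${reads}
    """
}

process cutadapt {            // picked up by withName: 'cutadapt'
    input:
    path reads

    output:
    path "trimmed_${reads}"

    script:
    """
    cutadapt -a AGATCGGAAGAGC -o trimmed_${reads} ${reads}
    """
}

workflow {
    // Hypothetical parameter, e.g. --reads 'data/*.fastq.gz'
    reads_ch = Channel.fromPath(params.reads)
    run_fastqc(reads_ch)
    cutadapt(reads_ch)
}
```

A label is handy when several processes should share one environment, while withName targets a single process.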

Then, in your nextflow.config, include a 'conda' profile to include the above configuration file and enable the use of Conda environments. Note that the latter is now required in newer versions of Nextflow:

includeConfig 'conf/base.config'

profiles {

    'conda' {
        includeConfig 'conf/conda.config'
        conda.enabled = true
    }
}

In the above example, the conf/base.config would always be applied, regardless of profile, and might contain the usual cpus/memory/time directives and errorStrategy etc.
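
As a rough idea of what such a conf/base.config might contain, here is a small sketch. The resource values and retry settings are placeholders I've made up, not recommendations:

```groovy
// conf/base.config (sketch; all values are hypothetical defaults)
process {
    cpus = 1
    memory = 2.GB
    time = 1.h

    // Resubmit a failed task up to twice before giving up
    errorStrategy = 'retry'
    maxRetries = 2
}
```

Because none of this touches the conda directive, the pipeline should behave the same with or without the 'conda' profile apart from how the tools are provided.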

Steve

You were very close with your second code block. Here's what works for me (version 22.10.6 build 5843):

profiles {
    conda {
        conda.enabled = true
        process {
            withName: fastqc {
                conda = "location/of/env/fastqc"
            }
        }
    }
}
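
The same pattern should extend to several processes, each with its own environment. In this sketch the multiqc process and its environment path are hypothetical, and main.nf is an assumed script name. Note that inside a selector the directive is written as conda = ..., not process.conda = ...:

```groovy
// nextflow.config (sketch; the multiqc process and both paths are placeholders)
profiles {
    conda {
        conda.enabled = true
        process {
            withName: fastqc {
                conda = "location/of/env/fastqc"
            }
            withName: multiqc {
                conda = "location/of/env/multiqc"
            }
        }
    }
}

// Switch it on per run with:
//   nextflow run main.nf -profile conda
```

Without -profile conda, none of these settings are applied, so the environments can be turned on and off from the command line.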
bricoletc