
Say we call a template script from within Nextflow. How can we pass arguments to that script? For instance, a process defined like this, with no arguments, works fine:

    process callPython {

        input:
        path(someinput)

        output:
        path(someoutput)

        shell:
        template 'myscript.py'
    }

However, if I add an argument to the template string, the process fails with a "can't find template file" error. E.g., this fails:

    process callPython {

        input:
        path(someinput)

        output:
        path(someoutput)

        shell:
        template 'myscript.py -someargument'
    }

Is there some way to tell Nextflow how to parse the string passed to template properly, so as to allow arguments? Thanks!

jh_

2 Answers


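TL;DR: `template` expects only the path of a template file (by default looked up in the 'templates' directory), so the whole string, arguments included, is treated as a file name and the lookup fails. Consider instead making your script a "third-party script": make it executable and move it into a folder called 'bin' in the root directory of your project repository. Nextflow automatically adds this folder to the $PATH in the execution environment, so you can call the script with command-line arguments from a regular script block.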


Typically, template files are used inside a script block and contain a shell script (e.g. Bash or Zsh), since:

The dollar character ($) is interpreted as a Nextflow variable placeholder when the script is run as a Nextflow template, whereas it is evaluated as a Bash variable when run as a Bash script. This can be very useful to test your script autonomously, i.e. independently from Nextflow execution. You only need to provide a Bash environment variable for each Nextflow variable existing in your script.

The advantage of template files is that you can test them independently of Nextflow. This is especially helpful if, for example, you already have a bunch of PBS (or SLURM) scripts (where you wouldn't usually provide command line arguments) and would like to use them as template files. For example:

Contents of templates/my_script.pbs:

#!/bin/bash
#PBS -N test_job
#PBS -q default_queue
#PBS -S /bin/bash
#PBS -l walltime=1:00:00
#PBS -l ncpus=1
#PBS -l mem=4gb

set -eu

echo "${greeting} world"

Contents of main.nf:

process test {

    debug true

    input:
    val(greeting) from 'Hello', 'Hola', 'Bonjour'

    script:
    template 'my_script.pbs'
}

Results:

$ nextflow run main.nf 
N E X T F L O W  ~  version 22.04.4
Launching `main.nf` [backstabbing_meitner] DSL1 - revision: 5f66c981d2
executor >  local (3)
[ba/93de8f] process > test (2) [100%] 3 of 3 ✔
Hello world

Bonjour world

Hola world

Note that you could also run the above PBS script directly with your Bash interpreter, since it is just a regular shell script with `#PBS` directives, which Bash treats as comments, at the top:

$ greeting="Hallo" bash templates/my_script.pbs
Hallo world

Or if, in your testing, you need the variable in subsequent commands, export it:

$ export greeting="Hallo"
$ bash templates/my_script.pbs 
Hallo world
$ echo "${greeting} there"
Hallo there

Note that shell blocks also support the use of template files. This is useful when your script contains dollar variables that should not be Nextflow variable placeholders, for example:

Contents of templates/my_script.sh:

#!/bin/bash

set -eu

echo "!{greeting} world" | awk '{ print $1 }'

Contents of main.nf:

process test {

    debug true

    input:
    val(greeting) from 'Hello', 'Hola', 'Bonjour'

    shell:
    template 'my_script.sh'
}

Results:

$ nextflow run main.nf 
N E X T F L O W  ~  version 22.04.4
Launching `main.nf` [tiny_dalembert] DSL1 - revision: 79aeea8b7e
executor >  local (3)
[13/78be26] process > test (3) [100%] 3 of 3 ✔
Hola

Hello

Bonjour

However, shell scripts with exclamation marks (!) to denote variables have limited value outside of Nextflow in my opinion:

$ greeting="Hallo" bash templates/my_script.sh 
!{greeting}
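
One way around this when testing is to fill in the placeholders yourself before handing the script to Bash. The following is only a minimal sketch of that idea, not Nextflow's own implementation: the function name `render_shell_template` is hypothetical, and it handles only simple `!{name}` placeholders, not arbitrary expressions such as `!{params.x}`.

```python
import re

# Matches simple shell-block placeholders such as !{greeting}.
# Assumption: only plain identifiers are handled, not full Groovy expressions.
PLACEHOLDER = re.compile(r"!\{(\w+)\}")


def render_shell_template(text: str, values: dict) -> str:
    """Replace each !{name} with values[name]; leave $-expressions untouched."""

    def substitute(match):
        name = match.group(1)
        if name not in values:
            raise KeyError(f"no value supplied for placeholder !{{{name}}}")
        return str(values[name])

    return PLACEHOLDER.sub(substitute, text)


if __name__ == "__main__":
    # Same command line as in templates/my_script.sh above.
    template = "echo \"!{greeting} world\" | awk '{ print $1 }'"
    print(render_shell_template(template, {"greeting": "Hallo"}))
```

The rendered text could then be piped to `bash` for a quick check. Note that the awk `$1` is left alone, which is the whole point of using a shell block.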

A better way might be to instead make your Python script a "third-party script". Just make it executable and move it into a folder called 'bin' in the root directory of your project repository. Nextflow automatically adds this folder to the $PATH in the execution environment. This lets you use it in a regular script block so that you can provide command-line arguments in the usual way:

Contents of bin/my_script.py:

#!/usr/bin/env python3
import sys

# The first command-line argument is the greeting passed in by the Nextflow process.
the_greeting = sys.argv[1]

print(f"{the_greeting} world")

Contents of main.nf:

process test {

    debug true

    input:
    val(greeting) from 'Hello', 'Hola', 'Bonjour'

    script:
    """
    my_script.py "${greeting}"
    """
}

Results:

$ nextflow run main.nf 
N E X T F L O W  ~  version 22.04.4
Launching `main.nf` [backstabbing_banach] DSL1 - revision: 4937773725
executor >  local (3)
[34/0df158] process > test (1) [100%] 3 of 3 ✔
Bonjour world

Hola world

Hello world
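
If the script needs optional flags (like the `-someargument` in the question) rather than a single positional value, `argparse` from the standard library is one way to handle them. This is just a sketch under assumptions: the `--shout` flag and the `make_message` helper are made up for illustration and are not part of the original script.

```python
#!/usr/bin/env python3
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Print a greeting.")
    parser.add_argument("greeting", help="word to print before 'world'")
    # Hypothetical optional flag, standing in for '-someargument' in the question.
    parser.add_argument("--shout", action="store_true", help="upper-case the output")
    return parser


def make_message(argv=None) -> str:
    """Parse argv (defaults to sys.argv[1:]) and build the greeting text."""
    args = build_parser().parse_args(argv)
    message = f"{args.greeting} world"
    return message.upper() if args.shout else message


if __name__ == "__main__":
    print(make_message())
```

With this saved as bin/my_script.py and made executable, the script block could call it as `my_script.py --shout "${greeting}"`.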

Steve

Use Nextflow's string interpolation to pass a variable from the Nextflow/Groovy scope into the shell block.

    process callPython {

        input:
        path(someinput)

        output:
        path(someoutput)

        shell:
        template 'myscript.py ${params.args}'
    }

Here you would call your Nextflow pipeline with

nextflow main.nf -args "your python args here"

You could also get your args from a channel or your config file.

Pallie
  • This will produce the same error even if you use `template 'myscript.py !{params.args}' `. The command-line `-args` should also be `--args` i.e. double dashed. – Steve Jul 08 '22 at 03:51