In snakemake, what is the recommended way to use the shell() function to execute multiple commands?

tedtoal

2 Answers

You can call shell() multiple times within the run block of a rule (rules can specify run: rather than shell:):

rule processing_step:
    input:
        # [...]
    output:
        # [...]
    run:
        shell("somecommand {input} > tempfile")
        shell("othercommand tempfile {output}")

Otherwise, since the run block accepts Python code, you could build a list of commands as strings and iterate over them:

rule processing_step:
    input:
        # [...]
    output:
        # [...]
    run:
        commands = [
            "somecommand {input} > tempfile",
            "othercommand tempfile {output}"
        ]
        for c in commands:
            shell(c)
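
One thing to keep in mind with several shell() calls: as far as I understand it, each call starts its own shell process, so a cd or a variable set in one call is gone by the next, and a command that exits non-zero raises an error that ends the run block. The sketch below illustrates that behaviour outside Snakemake, using Python's subprocess module as a stand-in. run_step is a hypothetical helper for this illustration, not part of Snakemake.

```python
import os
import subprocess
import tempfile


def run_step(command, cwd):
    """Hypothetical stand-in for one shell() call: run one command in its own shell."""
    result = subprocess.run(
        command, shell=True, cwd=cwd, check=True, capture_output=True, text=True
    )
    return result.stdout.strip()


workdir = tempfile.mkdtemp()
os.mkdir(os.path.join(workdir, "sub"))

# Two separate calls: the cd in the first call does not carry over to the second.
run_step("cd sub", cwd=workdir)
separate = run_step("pwd", cwd=workdir)

# One call containing both commands: the cd is still in effect for pwd.
combined = run_step("cd sub && pwd", cwd=workdir)

# A failing command raises, so later commands in a loop would never run.
try:
    run_step("false", cwd=workdir)
    stopped = False
except subprocess.CalledProcessError:
    stopped = True

print(os.path.basename(separate) == "sub")  # False
print(os.path.basename(combined) == "sub")  # True
print(stopped)                              # True
```

If commands need to share state such as the working directory, putting them in a single command string (or a single shell block) is the simpler choice.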

If you don't need Python code during the execution of the rule, you can use triple-quoted strings within a shell block, and write the commands as you would within a shell script. This is arguably the most readable for pure-shell rules:

rule processing_step:
    input:
        # [...]
    output:
        # [...]
    shell:
        """
        somecommand {input} > tempfile
        othercommand tempfile {output}
        """

If the shell commands depend on the success/failure of the preceding command, they can be joined with the usual shell script operators like || and &&:

rule processing_step:
    input:
        # [...]
    output:
        # [...]
    shell:
        "command_one && echo 'command_one worked' || echo 'command_one failed'"
tomkinsc

I thought I would add this example. It may not be a direct answer to the question, but I found this question while searching for how to run multiple shell commands with some of them running in a particular directory (for various reasons).

To keep things clean you could use a shell script.

Say I have a shell script scripts/run_somecommand.sh that does the following:

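#!/usr/bin/env sh
# Resolve all paths to absolute ones before changing directory.
input=$(realpath "$1")
output=$(realpath "$2")
log=$(realpath "$3")
sample="$4"

# Quote the expansions so paths containing spaces survive word splitting.
mkdir -p "data/analysis/${sample}"
cd "data/analysis/${sample}"
somecommand --input "${input}" --output "${output}" 2> "${log}"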

Then in your Snakemake rule you can do this:

rule somerule:
    input:
        "data/{sample}.fastq"
    output:
        "data/analysis/{sample}/{sample}_somecommand.json"
    log:
        "logs/somecommand_{sample}.log"
    shell:
        "scripts/run_somecommand.sh {input} {output} {log} {sample}"

Note: If you are working on a Mac and don't have realpath, you can install it with brew install coreutils. It's a very handy command.
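
The reason the script resolves its arguments with realpath before the cd is that Snakemake passes paths relative to the working directory, and those stop pointing at the right files once the directory changes. Here is a minimal sketch of that effect in plain Python. The temporary project layout and the sample name s1 are made up for the illustration.

```python
import os
import tempfile

original_cwd = os.getcwd()

# Build a throwaway project layout: data/s1.fastq and data/analysis/s1/
project = os.path.realpath(tempfile.mkdtemp())
os.makedirs(os.path.join(project, "data", "analysis", "s1"))
open(os.path.join(project, "data", "s1.fastq"), "w").close()

os.chdir(project)
relative_input = "data/s1.fastq"                   # what the rule would pass in
absolute_input = os.path.realpath(relative_input)  # what realpath produces

# After changing directory, only the absolute path still resolves.
os.chdir("data/analysis/s1")
relative_found = os.path.exists(relative_input)
absolute_found = os.path.exists(absolute_input)

os.chdir(original_cwd)  # restore the starting directory

print(relative_found)  # False
print(absolute_found)  # True
```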

Michael Hall