The following Snakemake rule fails when I execute it with snakemake -r -p --jobs 40 --cluster "qsub":

rule raven_assembly:
    """
    Assemble reads with Raven v1.5.0
    """
    input:
        "results/01_pooled_reads/eb_flongle_reads_pooled.fastq.gz"
    output:
        assembly="results/eb_raven_assembly.fasta",
    shell:
        """
        zcat {input} | head -n 2 > {output.assembly} 1> out.txt 2> errors.txt
        """

As you can probably tell, the original rule called the software Raven, but I have been simplifying the rule to investigate the source of the job failure.

The corresponding error message:

Error in rule raven_assembly:
    jobid: 1
    output: results/eb_raven_assembly.fasta
    shell:

        zcat results/01_pooled_reads/eb_flongle_reads_pooled.fastq.gz | head -n 2 > results/eb_raven_assembly.fasta &> errors.txt

        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
    cluster_jobid: Your job 183238 ("snakejob.raven_assembly.1.sh") has been submitted

Error executing rule raven_assembly on cluster (jobid: 1, external: Your job 183238 ("snakejob.raven_assembly.1.sh") has been submitted, jobscript: /misc/scratch3/jmartijn/snakemake-test/.snakemake/tmp.95wb9rak/snakejob.raven_assembly.1.sh). For error details see the cluster log and the log files of the involved rule(s).

The out.txt file actually contains the expected zcat output, while errors.txt is empty. If I run the zcat command manually, it works fine and returns a 0 exit status.

The jobscript disappears as soon as the Snakemake workflow closes, but if I check it while the job is still attempting to run, it looks like this:

#!/bin/sh
# properties = {"type": "single", "rule": "raven_assembly", "local": false, "input": ["results/01_pooled_reads/eb_flongle_reads_pooled.fastq.gz"], "output": ["results/eb_raven_assembly.fasta"], "wildcards": {}, "params": {}, "log": [], "threads": 1, "resources": {"mem_mb": 10903, "disk_mb": 10903, "tmpdir": "/tmp"}, "jobid": 1, "cluster": {}}
cd '/misc/scratch3/jmartijn/snakemake-test' && /scratch2/software/anaconda/envs/proj-ergo/bin/python3.7 -m snakemake --snakefile '/misc/scratch3/jmartijn/snakemake-test/Snakefile' 'results/eb_raven_assembly.fasta' --allowed-rules 'raven_assembly' --cores 'all' --attempt 1 --force-use-threads  --wait-for-files '/misc/scratch3/jmartijn/snakemake-test/.snakemake/tmp.ka4jh42u' 'results/01_pooled_reads/eb_flongle_reads_pooled.fastq.gz' --force --keep-target-files --keep-remote --max-inventory-time 0 --nocolor --notemp --no-hooks --nolock --ignore-incomplete --skip-script-cleanup  --conda-frontend 'mamba' --wrapper-prefix 'https://github.com/snakemake/snakemake-wrappers/raw/' --printshellcmds  --latency-wait 5 --scheduler 'ilp' --scheduler-solver-path '/scratch2/software/anaconda/envs/proj-ergo/bin' --default-resources 'mem_mb=max(2*input.size_mb, 1000)' 'disk_mb=max(2*input.size_mb, 1000)' 'tmpdir=system_tmpdir' --mode 2 && touch '/misc/scratch3/jmartijn/snakemake-test/.snakemake/tmp.ka4jh42u/1.jobfinished' || (touch '/misc/scratch3/jmartijn/snakemake-test/.snakemake/tmp.ka4jh42u/1.jobfailed'; exit 1)

The computer cluster runs SGE 8.1.9 on Ubuntu 18.04 LTS. The Snakemake version is 7.8.0.

  • `head` is the likely cause. See this [stackoverflow post](https://stackoverflow.com/q/46569236/3998252). – Manavalan Gajapathy May 30 '22 at 19:21
  • You are right. I had replaced my raven (another software) call with a simpler bash command, but apparently it wasn't as simple as that because of the pipe. If I replaced it with `touch {output.assembly}` it worked OK. In any case, my original problem was apparently that the $PATH wasn't passed on to the job. Simple bash commands worked, but raven wasn't in the $PATH in the job. The solution was to add the -V flag in the qsub command. – Joran Martijn May 31 '22 at 15:28
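
The `head` explanation from the comments can be reproduced outside Snakemake. The sketch below is an illustration only, not the asker's workflow: `yes` is a stand-in for `zcat` on a large file (an assumption: the input must be big enough that the producer is still writing when `head` exits), and the variable names are hypothetical. It compares default bash behaviour with the strict mode (`set -euo pipefail`) that Snakemake applies to `shell:` blocks.

```shell
#!/usr/bin/env bash
# `yes` stands in for `zcat` on a large file (assumption): it keeps
# writing after `head` has read its two lines and exited.

# Default bash: the pipeline's status is that of the last command (head).
bash -c 'yes | head -n 2 > /dev/null'
default_status=$?

# Strict mode as used for Snakemake shell blocks: with pipefail, the
# producer's failure (typically 141 = 128 + SIGPIPE) becomes the
# status of the whole pipeline.
bash -c 'set -euo pipefail; yes | head -n 2 > /dev/null'
strict_status=$?

# One common workaround: tolerate the producer's exit status.
bash -c 'set -euo pipefail; (yes || true) | head -n 2 > /dev/null'
workaround_status=$?

echo "default=$default_status strict=$strict_status workaround=$workaround_status"
```

This would explain why the same command returns 0 when run by hand in an interactive shell but is reported as failed inside the rule. It is separate from the original Raven failure, which the asker traced to `$PATH` not reaching the job and resolved by adding `-V` to the qsub command, for example `--cluster "qsub -V"`.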
