
I want to preface this by saying I am very new to Nextflow, so if I've left out a key piece of debugging information, I'm sorry; please just let me know.

====================================

Case 1: I tried to run this command:

nextflow run nf-core/rnaseq --aligner hisat2 -profile test,docker

But ended up getting this error:

-[nf-core/rnaseq] Pipeline completed with errors-
WARN: To render the execution DAG in the required format it is required to install Graphviz -- See http://www.graphviz.org for more info.
Error executing process > 'NFCORE_RNASEQ:RNASEQ:MULTIQC_CUSTOM_BIOTYPE (RAP1_UNINDUCED_REP2)'

Caused by:
  Process `NFCORE_RNASEQ:RNASEQ:MULTIQC_CUSTOM_BIOTYPE (RAP1_UNINDUCED_REP2)` terminated with an error exit status (1)

Command executed:

  cut -f 1,7 RAP1_UNINDUCED_REP2.featureCounts.txt | tail -n +3 | cat biotypes_header.txt - >> RAP1_UNINDUCED_REP2.biotype_counts_mqc.tsv
  mqc_features_stat.py RAP1_UNINDUCED_REP2.biotype_counts_mqc.tsv -s RAP1_UNINDUCED_REP2 -f rRNA -o RAP1_UNINDUCED_REP2.biotype_counts_rrna_mqc.tsv

Command exit status:
  1

Command output:
  (empty)

Command error:
  cut: RAP1_UNINDUCED_REP2.featureCounts.txt: No such file or directory
  cat: can't open 'biotypes_header.txt': No such file or directory

Work dir:
  /mnt/c/Users/mkozubov/Desktop/nextflow_tutorial/work/e7/5df55125d9662b3c6ee83cdeea9ea9

Tip: you can replicate the issue by changing to the process work dir and entering the command `bash .command.run`

I went to the "work" directory it pointed me to and ran the advertised `bash .command.run`, and it worked fine! Why did it error?
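
For reference, this is roughly how I poked around in that work directory (the hidden .command.* files are what the log's tip refers to):

    cd /mnt/c/Users/mkozubov/Desktop/nextflow_tutorial/work/e7/5df55125d9662b3c6ee83cdeea9ea9
    ls -la             # staged inputs show up here, usually as symlinks
    cat .command.sh    # the exact script the task executed
    cat .command.err   # the task's stderr
    bash .command.run  # re-runs the task, including input staging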

==========================================

Case 2:

Thinking my issue was with Docker, I also tried Singularity. I ran it three times, which led to two failures and one success. Here are the commands and errors:

  1. nextflow run nf-core/rnaseq --aligner hisat2 -profile test,singularity

Caused by:
  Failed to pull singularity image
  command: singularity pull  --name depot.galaxyproject.org-singularity-qualimap-2.2.2d--1.img.pulling.1631227989457 https://depot.galaxyproject.org/singularity/qualimap:2.2.2d--1 > /dev/null
  status : 255
  message:
    INFO:    Downloading network image
    INFO:    Cleaning up incomplete download: /home/mkozubov/.singularity/cache/net/tmp_601246724
    FATAL:   unexpected EOF

  2. nextflow run nf-core/rnaseq --aligner hisat2 -profile test,singularity -resume

Caused by:
  Failed to pull singularity image
  command: singularity pull  --name depot.galaxyproject.org-singularity-bioconductor-dupradar-1.18.0--r40_1.img.pulling.1631228803940 https://depot.galaxyproject.org/singularity/bioconductor-dupradar:1.18.0--r40_1 > /dev/null
  status : 255
  message:
    INFO:    Downloading network image
    INFO:    Cleaning up incomplete download: /home/mkozubov/.singularity/cache/net/tmp_504979312
    FATAL:   unexpected EOF

  3. nextflow run nf-core/rnaseq --aligner hisat2 -profile test,singularity -resume ecstatic_minsky
WARN: To render the execution DAG in the required format it is required to install Graphviz -- See http://www.graphviz.org for more info.
Completed at: 09-Sep-2021 16:40:43
Duration    : 26m 44s
CPU hours   : 1.5 (29.8% cached)
Succeeded   : 116
Cached      : 64

I realize my second resume probably didn't do anything, but why did resuming my first run fix anything? Why couldn't Singularity pull down the image it needed the first time? I'm a bit of a noob and don't really know where to start with debugging a problem like this, so any help would be greatly appreciated.
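
For what it's worth, the failing pull can be reproduced outside of Nextflow using the same URL from the first error (the --name here is just a local filename I picked):

    singularity pull --name qualimap-2.2.2d--1.img https://depot.galaxyproject.org/singularity/qualimap:2.2.2d--1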

===========================================================

Config file:

/*
========================================================================================
    nf-core/rnaseq Nextflow config file
========================================================================================
    Default config options for all compute environments
----------------------------------------------------------------------------------------
*/

// Global default params, used in configs
params {

    // Input options
    input                      = null

    // References
    genome                     = null
    transcript_fasta           = null
    additional_fasta           = null
    splicesites                = null
    gtf_extra_attributes       = 'gene_name'
    gtf_group_features         = 'gene_id'
    featurecounts_feature_type = 'exon'
    featurecounts_group_type   = 'gene_biotype'
    gencode                    = false
    save_reference             = false

    // UMI handling
    with_umi                   = false
    umitools_extract_method    = 'string'
    umitools_bc_pattern        = null
    save_umi_intermeds         = false

    // Trimming
    clip_r1                    = null
    clip_r2                    = null
    three_prime_clip_r1        = null
    three_prime_clip_r2        = null
    trim_nextseq               = null
    save_trimmed               = false
    skip_trimming              = false

    // Ribosomal RNA removal
    remove_ribo_rna            = false
    save_non_ribo_reads        = false
    ribo_database_manifest     = "${projectDir}/assets/rrna-db-defaults.txt"

    // Alignment
    aligner                    = 'star_salmon'
    pseudo_aligner             = null
    seq_center                 = null
    bam_csi_index              = false
    star_ignore_sjdbgtf        = false
    salmon_quant_libtype       = null
    hisat2_build_memory        = '200.GB'  // Amount of memory required to build HISAT2 index with splice sites
    stringtie_ignore_gtf       = false
    min_mapped_reads           = 5
    save_merged_fastq          = false
    save_unaligned             = false
    save_align_intermeds       = false
    skip_markduplicates        = false
    skip_alignment             = false

    // QC
    skip_qc                    = false
    skip_bigwig                = false
    skip_stringtie             = false
    skip_fastqc                = false
    skip_preseq                = false
    skip_dupradar              = false
    skip_qualimap              = false
    skip_rseqc                 = false
    skip_biotype_qc            = false
    skip_deseq2_qc             = false
    skip_multiqc               = false
    deseq2_vst                 = false
    rseqc_modules              = 'bam_stat,inner_distance,infer_experiment,junction_annotation,junction_saturation,read_distribution,read_duplication'

    // Boilerplate options
    outdir                     = './results'
    publish_dir_mode           = 'copy'
    multiqc_config             = null
    multiqc_title              = null
    email                      = null
    email_on_fail              = null
    max_multiqc_email_size     = '25.MB'
    plaintext_email            = false
    monochrome_logs            = false
    help                       = false
    igenomes_base              = 's3://ngi-igenomes/igenomes'
    tracedir                   = "${params.outdir}/pipeline_info"
    igenomes_ignore            = false
    validate_params            = true
    show_hidden_params         = false
    schema_ignore_params       = 'genomes,modules'
    enable_conda               = false
    singularity_pull_docker_container = false

    // Config options
    custom_config_version      = 'master'
    custom_config_base         = "https://raw.githubusercontent.com/nf-core/configs/${params.custom_config_version}"
    hostnames                  = [:]
    config_profile_description = null
    config_profile_contact     = null
    config_profile_url         = null
    config_profile_name        = null

    // Max resource options
    // Defaults only, expecting to be overwritten
    max_memory                 = '128.GB'
    max_cpus                   = 16
    max_time                   = '240.h'
}

// Load base.config by default for all pipelines
includeConfig 'conf/base.config'

// Load modules.config for DSL2 module specific options
includeConfig 'conf/modules.config'

// Load nf-core custom profiles from different Institutions
try {
    includeConfig "${params.custom_config_base}/nfcore_custom.config"
} catch (Exception e) {
    System.err.println("WARNING: Could not load nf-core/config profiles: ${params.custom_config_base}/nfcore_custom.config")
}

// Load nf-core/rnaseq custom config
try {
    includeConfig "${params.custom_config_base}/pipeline/rnaseq.config"
} catch (Exception e) {
    System.err.println("WARNING: Could not load nf-core/config/rnaseq profiles: ${params.custom_config_base}/pipeline/rnaseq.config")
}

// Load igenomes.config if required
if (!params.igenomes_ignore) {
    includeConfig 'conf/igenomes.config'
} else {
    params.genomes = [:]
}

profiles {
    debug { process.beforeScript = 'echo $HOSTNAME' }
    conda {
        params.enable_conda    = true
        docker.enabled         = false
        singularity.enabled    = false
        podman.enabled         = false
        shifter.enabled        = false
        charliecloud.enabled   = false
    }
    docker {
        docker.enabled         = true
        docker.userEmulation   = true
        singularity.enabled    = false
        podman.enabled         = false
        shifter.enabled        = false
        charliecloud.enabled   = false
    }
    singularity {
        singularity.enabled    = true
        singularity.autoMounts = true
        docker.enabled         = false
        podman.enabled         = false
        shifter.enabled        = false
        charliecloud.enabled   = false
    }
    podman {
        podman.enabled         = true
        docker.enabled         = false
        singularity.enabled    = false
        shifter.enabled        = false
        charliecloud.enabled   = false
    }
    shifter {
        shifter.enabled        = true
        docker.enabled         = false
        singularity.enabled    = false
        podman.enabled         = false
        charliecloud.enabled   = false
    }
    charliecloud {
        charliecloud.enabled   = true
        docker.enabled         = false
        singularity.enabled    = false
        podman.enabled         = false
        shifter.enabled        = false
    }
    test      { includeConfig 'conf/test.config'      }
    test_full { includeConfig 'conf/test_full.config' }
}

// Export these variables to prevent local Python/R libraries from conflicting with those in the container
env {
    PYTHONNOUSERSITE = 1
    R_PROFILE_USER   = "/.Rprofile"
    R_ENVIRON_USER   = "/.Renviron"
}

def trace_timestamp = new java.util.Date().format( 'yyyy-MM-dd_HH-mm-ss')
timeline {
    enabled = true
    file    = "${params.tracedir}/execution_timeline_${trace_timestamp}.html"
}
report {
    enabled = true
    file    = "${params.tracedir}/execution_report_${trace_timestamp}.html"
}
trace {
    enabled = true
    file    = "${params.tracedir}/execution_trace_${trace_timestamp}.txt"
}
dag {
    enabled = true
    file    = "${params.tracedir}/pipeline_dag_${trace_timestamp}.svg"
}

manifest {
    name            = 'nf-core/rnaseq'
    author          = 'Phil Ewels, Rickard Hammarén'
    homePage        = 'https://github.com/nf-core/rnaseq'
    description     = 'Nextflow RNA-Seq analysis pipeline, part of the nf-core community.'
    mainScript      = 'main.nf'
    nextflowVersion = '!>=21.04.0'
    version         = '3.3'
}

// Function to ensure that resource requirements don't go beyond
// a maximum limit
def check_max(obj, type) {
    if (type == 'memory') {
        try {
            if (obj.compareTo(params.max_memory as nextflow.util.MemoryUnit) == 1)
                return params.max_memory as nextflow.util.MemoryUnit
            else
                return obj
        } catch (all) {
            println "   ### ERROR ###   Max memory '${params.max_memory}' is not valid! Using default value: $obj"
            return obj
        }
    } else if (type == 'time') {
        try {
            if (obj.compareTo(params.max_time as nextflow.util.Duration) == 1)
                return params.max_time as nextflow.util.Duration
            else
                return obj
        } catch (all) {
            println "   ### ERROR ###   Max time '${params.max_time}' is not valid! Using default value: $obj"
            return obj
        }
    } else if (type == 'cpus') {
        try {
            return Math.min( obj, params.max_cpus as int )
        } catch (all) {
            println "   ### ERROR ###   Max cpus '${params.max_cpus}' is not valid! Using default value: $obj"
            return obj
        }
    }
}

=================================

I built the environment using conda. Here is the output of `conda env export`:

name: nf-core
channels:
  - bioconda
  - conda-forge
  - defaults
dependencies:
  - _libgcc_mutex=0.1=conda_forge
  - _openmp_mutex=4.5=1_gnu
  - alsa-lib=1.2.3=h516909a_0
  - appdirs=1.4.4=pyh9f0ad1d_0
  - attrs=21.2.0=pyhd8ed1ab_0
  - backports=1.0=py_2
  - backports.functools_lru_cache=1.6.4=pyhd8ed1ab_0
  - brotlipy=0.7.0=py37h5e8e339_1001
  - bzip2=1.0.8=h7f98852_4
  - c-ares=1.17.2=h7f98852_0
  - ca-certificates=2021.5.30=ha878542_0
  - cairo=1.16.0=h6cf1ce9_1008
  - cattrs=1.8.0=pyhd8ed1ab_0
  - certifi=2021.5.30=py37h89c1867_0
  - cffi=1.14.6=py37hc58025e_0
  - chardet=4.0.0=py37h89c1867_1
  - charset-normalizer=2.0.0=pyhd8ed1ab_0
  - click=8.0.1=py37h89c1867_0
  - cni=0.8.0=hc0beb16_0
  - cni-plugins=0.9.1=ha8f183a_0
  - colorama=0.4.4=pyh9f0ad1d_0
  - commonmark=0.9.1=py_0
  - coreutils=8.25=1
  - cryptography=3.4.7=py37h5d9358c_0
  - curl=7.78.0=hea6ffbf_0
  - expat=2.4.1=h9c3ff4c_0
  - fontconfig=2.13.1=hba837de_1005
  - freetype=2.10.4=h0708190_1
  - future=0.18.2=py37h89c1867_3
  - gettext=0.19.8.1=h0b5b191_1005
  - giflib=5.2.1=h36c2ea0_2
  - git=2.33.0=pl5321hc30692c_0
  - gitdb=4.0.7=pyhd8ed1ab_0
  - gitpython=3.1.18=pyhd8ed1ab_0
  - graphite2=1.3.13=h58526e2_1001
  - harfbuzz=2.9.1=h83ec7ef_0
  - icu=68.1=h58526e2_0
  - idna=3.1=pyhd3deb0d_0
  - importlib-metadata=4.8.1=py37h89c1867_0
  - importlib_metadata=4.8.1=hd8ed1ab_0
  - itsdangerous=2.0.1=pyhd8ed1ab_0
  - jbig=2.1=h7f98852_2003
  - jinja2=3.0.1=pyhd8ed1ab_0
  - jpeg=9d=h36c2ea0_0
  - jq=1.6=h36c2ea0_1000
  - jsonschema=3.2.0=py37hc8dfbb8_1
  - krb5=1.19.2=hcc1bbae_0
  - lcms2=2.12=hddcbb42_0
  - ld_impl_linux-64=2.36.1=hea4e1c9_2
  - lerc=2.2.1=h9c3ff4c_0
  - libarchive=3.5.2=hccf745f_0
  - libcurl=7.78.0=h2574ce0_0
  - libdeflate=1.7=h7f98852_5
  - libedit=3.1.20191231=he28a2e2_2
  - libev=4.33=h516909a_1
  - libffi=3.3=h58526e2_2
  - libgcc=7.2.0=h69d50b8_2
  - libgcc-ng=11.1.0=hc902ee8_8
  - libglib=2.68.4=h3e27bee_0
  - libgomp=11.1.0=hc902ee8_8
  - libiconv=1.16=h516909a_0
  - libnghttp2=1.43.0=h812cca2_0
  - libpng=1.6.37=h21135ba_2
  - libseccomp=2.4.4=h36c2ea0_0
  - libssh2=1.10.0=ha56f1ee_0
  - libstdcxx-ng=11.1.0=h56837e0_8
  - libtiff=4.3.0=hf544144_1
  - libuuid=2.32.1=h7f98852_1000
  - libwebp-base=1.2.1=h7f98852_0
  - libxcb=1.13=h7f98852_1003
  - libxml2=2.9.12=h72842e0_0
  - lz4-c=1.9.3=h9c3ff4c_1
  - lzo=2.10=h516909a_1000
  - markupsafe=2.0.1=py37h5e8e339_0
  - ncurses=6.2=h58526e2_4
  - nextflow=21.04.0=h4a94de4_0
  - nf-core=2.1=pyh5e36f6f_0
  - oniguruma=6.9.7.1=h7f98852_0
  - openjdk=11.0.9.1=h5cc2fde_1
  - openssl=1.1.1l=h7f98852_0
  - packaging=21.0=pyhd8ed1ab_0
  - pcre=8.45=h9c3ff4c_0
  - pcre2=10.37=h032f7d1_0
  - perl=5.32.1=0_h7f98852_perl5
  - pip=21.2.4=pyhd8ed1ab_0
  - pixman=0.40.0=h36c2ea0_0
  - prompt-toolkit=3.0.20=pyha770c72_0
  - prompt_toolkit=3.0.20=hd8ed1ab_0
  - pthread-stubs=0.4=h36c2ea0_1001
  - pycparser=2.20=pyh9f0ad1d_2
  - pygments=2.10.0=pyhd8ed1ab_0
  - pyopenssl=20.0.1=pyhd8ed1ab_0
  - pyparsing=2.4.7=pyh9f0ad1d_0
  - pyrsistent=0.17.3=py37h5e8e339_2
  - pysocks=1.7.1=py37h89c1867_3
  - python=3.7.10=hffdb5ce_100_cpython
  - python_abi=3.7=2_cp37m
  - pyyaml=5.4.1=py37h5e8e339_1
  - questionary=1.10.0=pyhd8ed1ab_0
  - readline=8.1=h46c0cb4_0
  - requests=2.26.0=pyhd8ed1ab_0
  - requests-cache=0.8.0=pyhd8ed1ab_0
  - rich=10.9.0=py37h89c1867_0
  - setuptools=58.0.4=py37h89c1867_0
  - singularity=3.7.1=hca90b9e_0
  - six=1.16.0=pyh6c4a22f_0
  - smmap=3.0.5=pyh44b312d_0
  - sqlite=3.36.0=h9cd32fc_1
  - squashfs-tools=4.4=h6b73730_2
  - tabulate=0.8.9=pyhd8ed1ab_0
  - tk=8.6.11=h27826a3_1
  - typing_extensions=3.10.0.0=pyha770c72_0
  - url-normalize=1.4.3=pyhd8ed1ab_0
  - urllib3=1.26.6=pyhd8ed1ab_0
  - wcwidth=0.2.5=pyh9f0ad1d_2
  - wheel=0.37.0=pyhd8ed1ab_1
  - xorg-fixesproto=5.0=h7f98852_1002
  - xorg-inputproto=2.3.2=h7f98852_1002
  - xorg-kbproto=1.0.7=h7f98852_1002
  - xorg-libice=1.0.10=h7f98852_0
  - xorg-libsm=1.2.3=hd9c2040_1000
  - xorg-libx11=1.7.2=h7f98852_0
  - xorg-libxau=1.0.9=h7f98852_0
  - xorg-libxdmcp=1.1.3=h7f98852_0
  - xorg-libxext=1.3.4=h7f98852_1
  - xorg-libxfixes=5.0.3=h7f98852_1004
  - xorg-libxi=1.7.10=h7f98852_0
  - xorg-libxrender=0.9.10=h7f98852_1003
  - xorg-libxtst=1.2.3=h7f98852_1002
  - xorg-recordproto=1.14.2=h7f98852_1002
  - xorg-renderproto=0.11.1=h7f98852_1002
  - xorg-xextproto=7.3.0=h7f98852_1002
  - xorg-xproto=7.0.31=h7f98852_1007
  - xz=5.2.5=h516909a_1
  - yaml=0.2.5=h516909a_0
  - zipp=3.5.0=pyhd8ed1ab_0
  - zlib=1.2.11=h516909a_1010
  - zstd=1.5.0=ha95c52a_0

2 Answers


Sometimes jobs fail for various reasons, and Nextflow pipelines can handle these errors differently, for better or worse. The nf-core/rnaseq pipeline (version 3.3) uses the following errorStrategy:

    errorStrategy = { task.exitStatus in [143,137,104,134,139] ? 'retry' : 'finish' }
    maxRetries    = 1
    maxErrors     = '-1'

https://github.com/nf-core/rnaseq/blob/3.3/conf/base.config#L17-L19

Note that the value for maxRetries is only applied when using the 'retry' error strategy.
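
To make that concrete, here is a rough illustration (standalone Groovy, not pipeline code) of how the closure maps exit codes to actions; your Case 1 exit status of 1 falls through to 'finish':

    // Illustrative only: the same ternary the pipeline uses, applied to two exit codes
    def strategy = { exitStatus -> exitStatus in [143, 137, 104, 134, 139] ? 'retry' : 'finish' }
    assert strategy(137) == 'retry'   // e.g. a task killed for exceeding its memory limit
    assert strategy(1)   == 'finish'  // orderly shutdown: wait for submitted jobs, then stop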


You get a 'No such file or directory' error in your 'Case 1' because the input files weren't staged before the script commands were run. Re-running the `.command.run` script (as you did) will first try to stage the input files before running the script commands in `.command.sh`. You should have been able to just `-resume` the workflow without having to manually intervene, and the failed job would have been retried automatically.
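
For example, re-running your original command with `-resume` appended would have reused the cached results and re-executed only the failed task:

    nextflow run nf-core/rnaseq --aligner hisat2 -profile test,docker -resume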

The two failures in 'Case 2' look like network errors when pulling the two (different) Singularity images. This could be the result of a weak network connection.

I wouldn't worry too much about errors like these; they aren't uncommon at all. That said, I think the first could be handled better, and I would just set `errorStrategy = 'retry'` in your nextflow.config to override the default behavior. I actually find a dynamic retry with backoff (like below) works quite well too. As for the network issues, it might be worthwhile setting up a cacheDir for Singularity to avoid repeated pulls if you plan on running the pipeline over and over again:

process {

  // Back off exponentially before each retry: ~300 ms, 600 ms, 1200 ms
  errorStrategy = {
    sleep( Math.pow( 2, task.attempt ) * 150 as long )
    return 'retry'
  }
  maxRetries = 3
}

singularity {

  // Re-use previously pulled images across runs instead of downloading them again
  cacheDir = '/path/to/containers'
}
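
If you'd rather not hard-code the path in a config file, Nextflow also honours the NXF_SINGULARITY_CACHEDIR environment variable (the directory is just an example):

    export NXF_SINGULARITY_CACHEDIR=/path/to/containers
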
Steve
  • Wow, thank you for these answers! Do you know how the first error can be prevented? If I'm running workflows, I'd rather not have to manually do a `-resume` for a workflow that should've run overnight. – Matthew Kozubov Sep 13 '21 at 17:04
  • No worries at all! You should be able to override the workflow's errorStrategy by adding `process.errorStrategy = 'retry'` (or the above) to your [nextflow.config](https://www.nextflow.io/docs/latest/config.html). Your nextflow.config can just be a file called `nextflow.config` in your current directory (i.e. the directory from where you run Nextflow). HTH. – Steve Sep 13 '21 at 23:42
===========================================================

I forgot to mention that I am on a Windows 10 PC with WSL2 configured, and that I was having a strange issue where VMMEM was grabbing all my memory and not letting it go. After messing with Nextflow and poring over forums looking for the cause of my issues and errors, I realized I am a giant noob: I had set my .wslconfig file to limit my subsystem to only 2GB of memory, but the default nf-core/rnaseq pipeline asks for 6GB.
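
For anyone else on WSL2, the memory cap lives in %UserProfile%\.wslconfig and needs a `wsl --shutdown` to take effect. A rough sketch (the 8GB value is illustrative; raising the limit is the alternative to lowering --max_memory):

    [wsl2]
    # Anything above the pipeline's 6GB default would have avoided my problem
    memory=8GB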

This command fixed all my issues:

nextflow run nf-core/rnaseq -profile test,singularity --aligner hisat2 --max_memory 1.5GB

I hope I am correctly identifying the underlying issue, but nf-core/rnaseq now works for me :)

Edit: They even mention that the default resources may not be appropriate here: https://nf-co.re/rnaseq/usage#resource-requests
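
Based on those docs, resources can also be raised per process with a custom config passed via `-c`; a minimal sketch (the process selector and values are illustrative, not copied from the docs):

    // custom.config -- use with: nextflow run nf-core/rnaseq ... -c custom.config
    process {
        withName: 'HISAT2_ALIGN' {
            memory = 8.GB
            cpus   = 4
        }
    }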