
I have a Nextflow process that uses a bash script (check_bam.sh) to generate a text file. The file contains a single number: either 0 or some other value. I would like to read that number into a Nextflow variable and use it in a conditional: if the file contains 0, the pipeline should skip some processes; if it contains any other number, the pipeline should run completely. I have no trouble with Nextflow conditionals or with setting channels to empty. My problem is saving a value that is generated inside the script block into a Nextflow variable that can be used outside the process.

The process that generates the file (result_bam.txt) with the 0 or other number is as follows (I have simplified it to make it as clear as possible):

process CHECK_BAM {

    input:
    path bam from channel_bam

    output:
    path "result_bam.txt"
    path "result_bam.txt" into channel_check_bam

    script:
    """
    bash bin/check_bam.sh $bam > result_bam.txt
    """
}

What I am checking is the number of mapped reads in the BAM file, and I would like to save that number into a Nextflow variable because if the number is zero, the execution should skip most of the following processes, but if the number is different than zero, it means that there are mapped reads in the file and the execution should continue as intended.

I thought that something like cat result_bam.txt > $FOO or FOO=`cat result_bam.txt` could be a solution, but I don't know how to save the value so that the variable is usable between processes.

zx8754
Cris Tuñí

2 Answers


Use an env output to grab the value: set FOO=`cat result_bam.txt` in the script block and declare FOO as an env output, which turns it into a channel.
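
A minimal sketch of that idea in the DSL1 syntax used in the question, assuming a Nextflow version that supports the `env` output qualifier. The names MAPPED_READS and channel_mapped_reads are made up for illustration, and check_bam.sh is assumed to print only the number to standard output:

```groovy
// DSL1 sketch: capture the count in a shell variable and emit it via `env`.
process CHECK_BAM {

    input:
    path bam from channel_bam

    output:
    // MAPPED_READS is a hypothetical variable name; it must be set in the script.
    env MAPPED_READS into channel_mapped_reads

    script:
    """
    MAPPED_READS=\$(bash bin/check_bam.sh $bam)
    """
}

// The value arrives as a string, so convert it before comparing.
channel_mapped_reads
    .map { it.trim().toInteger() }
    .view { n -> "mapped reads: $n" }
```

Note the escaped `\$(`, which keeps Nextflow from interpolating the shell command substitution, while `$bam` is still filled in by Nextflow.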

Pallie

A few things come to mind; hopefully I understand your problem correctly. Is check_bam.sh only counting the lines of the BAM file?


The first option would be to check from within your pipeline whether the BAM file has any content. This might be useful: the countLines documentation. Be cautious here, as a huge BAM file can lead to a memory exception (countLines "loads" the file).


The second option, maybe the better one, is to pass the file result_bam.txt into the channel channel_check_bam and then run the following process only if the number in result_bam.txt is greater than 0. When you connect this channel to another process, read the content like this:

input:
  val bam_lines from channel_check_bam.map{ it.readLines() } // Gives a list of lines, so 1st line will be your number of mapped reads.

when:
  bam_lines[0].toInteger() > 0 

This way the process runs only when the number in result_bam.txt is > 0.
I tested this with DSL2, so the code might need some small changes, but it works.
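
For readers on DSL2, the same idea might look roughly like the sketch below. This is an assumption-laden illustration rather than the exact code that was tested: params.bam is a hypothetical parameter, PROCESS1 is a placeholder, and the file is converted to an integer once in the workflow instead of inside the `when` block.

```groovy
nextflow.enable.dsl = 2

process CHECK_BAM {
    input:
    path bam

    output:
    path "result_bam.txt"

    script:
    """
    bash bin/check_bam.sh $bam > result_bam.txt
    """
}

process PROCESS1 {
    input:
    val mapped_reads

    // Skip this process when the BAM file has no mapped reads.
    when:
    mapped_reads > 0

    script:
    """
    echo "Found ${mapped_reads} mapped reads"
    """
}

workflow {
    // params.bam is a hypothetical parameter pointing at the input BAM file(s).
    bam_ch = Channel.fromPath(params.bam)

    CHECK_BAM(bam_ch)

    // Read the file content and convert it to an integer.
    counts = CHECK_BAM.out.map { it.text.trim().toInteger() }

    PROCESS1(counts)
}
```

An alternative with the same effect would be to drop the `when` block and apply `.filter { it > 0 }` to the counts channel, so that nothing is emitted for empty BAM files.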


Cris Tuñí - Edit: 08/24/2021

Thanks to the help of Dawid Gacek, I could edit my processes to run only when the number in the file is different from zero. My code ended up looking like this:

process CHECK_BAM {

    input:
    path bam from channel_bam

    output:
    path "result_bam.txt"
    path "result_bam.txt" into channel_check_bam_process1,
                               channel_check_bam_process2

    script:
    """
    bash bin/check_bam.sh $bam > result_bam.txt
    """
}

process PROCESS1 {

    input:
    val bam_lines from channel_check_bam_process1.map{ it.readLines() }

    when:
    bam_lines[0].toInteger() > 0

    script:
    """
    foo bar baz
    """
}

Hope this helps anyone with the same question or a similar issue!

zx8754
Dawid Gacek
  • Hi! Thank you for your answer! Exactly, the `check_bam.sh` script is counting lines, so when the number of lines equals zero, it means the BAM file is empty. I will look into the `countLines` method regardless, but your solution using the `when` statement with the other processes seems better, since I would not have to use huge `if else` statements. I will edit my answer once tested and let you know, thanks again! – Cris Tuñí Aug 24 '21 at 11:24
  • Can't edit the message but already tried the second solution and worked like a charm! Thank you very much :) – Cris Tuñí Aug 24 '21 at 12:14
  • No problem, I'm glad that I could help! – Dawid Gacek Aug 24 '21 at 13:02