I am using the following script to concatenate the reads from my samples. Each sub-directory contains several R1.fastq.gz and R2.fastq.gz files that I want to merge into a single R1.fastq.gz file and a single R2.fastq.gz file.
sourcedir=/sourcepath/
destdir=/destinationpath/
for f in "$sourcedir"/*
do
    fbase=$(basename "$f")
    echo "Inside $fbase"
    # Quote the variables so paths containing spaces do not break; the glob stays unquoted
    zcat "$f"/*R1*.fastq.gz | gzip > "$destdir/${fbase}_R1.fastq.gz"
    zcat "$f"/*R2*.fastq.gz | gzip > "$destdir/${fbase}_R2.fastq.gz"
done
I want to validate that the R1 and R2 reads were each concatenated correctly by comparing the total number of lines in the individual fastq.gz files with the number of lines in the merged file.
wc -l *R1*.fastq.gz (Individual files)
12832112 total
wc -l Sample_51770BL1_R1.fastq.gz (merged file)
Total:10397604
Shouldn't the two numbers be equal? Or is there another way to validate that the files were merged correctly?
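For reference, wc -l applied directly to a .gz file counts newline bytes in the compressed data, not lines of the reads, so the two totals above are not expected to match. Below is a minimal sketch of the same comparison done on the decompressed contents. The function name count_lines is hypothetical, and the paths in the usage comments are placeholders built from the variables in the script above.

```shell
# Hypothetical helper: count lines in the decompressed contents of one or more gzip files.
count_lines() {
    # gzip -cd decompresses to stdout (like zcat); wc -l then counts real text lines
    gzip -cd "$@" | wc -l
}

# Example usage (placeholder paths, adjust to your layout):
#   parts=$(count_lines "$sourcedir"/Sample_51770BL1/*R1*.fastq.gz)
#   merged=$(count_lines "$destdir"/Sample_51770BL1_R1.fastq.gz)
#   [ "$parts" -eq "$merged" ] && echo "R1 line counts match" || echo "R1 line counts differ"
```

If the two counts agree, dividing either one by 4 should give the number of reads, assuming standard four-line FASTQ records.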
Also, is there any way to speed up the process? I tried adding & as suggested in "How do I use parallel programming/multi threading in my bash script?", but it does not run at all.
zcat $f/*R1*.fastq.gz | gzip >$destdir/"$fbase"_R1.fastq.gz &
zcat $f/*R2*.fastq.gz | gzip >$destdir/"$fbase"_R2.fastq.gz &
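To make the attempt above concrete, here is a minimal sketch of how the two pipelines could be placed inside the loop as background jobs, with wait holding the loop until both have finished. The function name merge_all and its two arguments are hypothetical. Whether this is actually faster depends on the number of CPU cores and on disk throughput, so it is an arrangement to try rather than a guaranteed speed-up.

```shell
# Sketch only: merge_all is a hypothetical name; source and destination are passed as arguments.
merge_all() {
    sourcedir=$1
    destdir=$2
    for f in "$sourcedir"/*
    do
        fbase=$(basename "$f")
        echo "Inside $fbase"
        # Start the R1 and R2 merges as background jobs so they run concurrently.
        # gzip -cd decompresses to stdout, like zcat.
        gzip -cd "$f"/*R1*.fastq.gz | gzip > "$destdir/${fbase}_R1.fastq.gz" &
        gzip -cd "$f"/*R2*.fastq.gz | gzip > "$destdir/${fbase}_R2.fastq.gz" &
        # Block until both background jobs for this sample have finished
        wait
    done
}

# Example usage (placeholder paths):
#   merge_all /sourcepath /destinationpath
```

Calling wait inside the loop limits the script to two jobs at a time; without it, every sample would start its jobs at once.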