
Difficult bash script with the somewhat successful for-loop

#!/bin/bash
 
HG19BAM=/3568_5891_5871/3568-NRSA-output-M1chip/3568/3568-hg19-bam 

BW=/3568_5891_5871/3568-NRSA-output-M1chip/3568/3568-hg19-bam/3568-BIGWIG_RPM_tools

source /Users/user/opt/anaconda3/etc/profile.d/conda.sh

VAR1="1 2 3 4 5 6 7 8"

VAR2="39.179 27.394 27.623 32.775 25.2577 30.627 27.6229 30.802"

conda activate Bedtools 
 
for i in $VAR1; do for j in $VAR2
do      
echo "Creating minus strand ${i} scaling to ${j}"

bedtools genomecov -ibam ../3568-MB-${i}.hg19.sorted.F4q10.BLfiltered.bam -bg -scale ${j} -strand - -5 > 3568-${i}.strandminus.5.normRPM.bedGraph

sort -k1,1 -k2,2n 3568-${i}.strandminus.5.normRPM.bedGraph > 3568-${i}.strandminus.5.normRPM.sorted.bedGraph

bedGraphToBigWig 3568-${i}.strandminus.5.normRPM.sorted.bedGraph /Users/user/bin/homer/data/genomes/hg19/chrom.sizes \
     3568-${i}.strandminus.5.normRPM.sorted.bw

echo "Creating positive strand ${i} scaling to ${j}"

bedtools genomecov -ibam ../3568-MB-${i}.hg19.sorted.F4q10.BLfiltered.bam -bg -scale ${j} -strand + -5 > 3568-${i}.strandplus.5.normRPM.bedGraph

sort -k1,1 -k2,2n 3568-${i}.strandplus.5.normRPM.bedGraph > 3568-${i}.strandplus.5.normRPM.sorted.bedGraph

bedGraphToBigWig 3568-${i}.strandplus.5.normRPM.sorted.bedGraph /Users/user/bin/homer/data/genomes/hg19/chrom.sizes \
     3568-${i}.strandplus.5.normRPM.sorted.bw

done
done

conda deactivate 

Output:

bash /Users/user/Library/Mobile\ Documents/com\~apple\~TextEdit/Documents/3568-bigiwggenomecov.sh
Creating minus strand 1 scaling to 39.179
Creating positive strand 1 scaling to 39.179
Creating minus strand 1 scaling to 27.394

How do I only pair 1 with 39.179 and 2 with 27.394, etc?

How do I change the for loop so that I do not compare every $VAR1 to every $VAR2?

  • Note that the linked duplicates are using arrays instead of strings for storing lists. That's proper practice in general: `VAR2="39.179 27.394 27.623 32.775 25.2577 30.627 27.6229 30.802"` should be `var2=( 39.179 27.394 27.623 32.775 25.2577 30.627 27.6229 30.802 )`, and then `for i in $VAR2` should instead be `for i in "${var2[@]}"` (if you weren't trying to iterate in lockstep with a second array; since that's the case, you'll want `for idx in "${!var2[@]}"`, and at which point you can refer to `"${var2[$idx]}"`... or also `"${var1[$idx]}"`) – Charles Duffy Jul 15 '22 at 20:25
  • ...see the links up at the top of your question for details. – Charles Duffy Jul 15 '22 at 20:26
  • (use of lower-case variable names is per POSIX recommendation -- see https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap08.html; all-caps names are used for variables with meaning to the shell itself and POSIX-defined tools, whereas other names are reserved for application use; your scripts, in this context, are applications -- in reading that spec, keep in mind that environment variables and regular shell variables share a single namespace). – Charles Duffy Jul 15 '22 at 20:30

1 Answer

If you only need pairs, you do not need a nested loop. One approach is to convert VAR1 and VAR2 into arrays, then iterate over a single index i (running from 0 to the number of elements minus one) and use VAR1[i] and VAR2[i] at each step.

So you have to find out:

  • how to transform your input strings to arrays
  • how to iterate over them at the same time

Hope this points you in the right direction.
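A minimal sketch of both steps, assuming bash (the `read -a` builtin and here-strings are bash features, not plain sh). The lower-case names `samples`, `scales`, `sample_arr`, and `scale_arr` are illustrative and not from the question; the values are copied from VAR1 and VAR2. The bedtools, sort, and bedGraphToBigWig commands are left out and would go inside the loop body unchanged.

```bash
#!/bin/bash
# Illustrative names; the values are VAR1 and VAR2 from the question.
samples="1 2 3 4 5 6 7 8"
scales="39.179 27.394 27.623 32.775 25.2577 30.627 27.6229 30.802"

# Step 1: split each space-separated string into an array.
read -r -a sample_arr <<< "$samples"
read -r -a scale_arr <<< "$scales"

# Stop early if the two lists cannot be paired one-to-one.
if (( ${#sample_arr[@]} != ${#scale_arr[@]} )); then
    echo "sample and scale lists differ in length" >&2
    exit 1
fi

# Step 2: loop over the indices once, so element N of one array
# is only ever used with element N of the other.
for idx in "${!sample_arr[@]}"; do
    i=${sample_arr[$idx]}
    j=${scale_arr[$idx]}
    echo "Creating minus strand ${i} scaling to ${j}"
    # The genomecov / sort / bedGraphToBigWig commands from the
    # question would go here, using ${i} and ${j} as before.
done
```

This prints one line per sample (eight in total), starting with "Creating minus strand 1 scaling to 39.179". If the lists are written as arrays from the start, e.g. `sample_arr=( 1 2 3 )`, step 1 can be dropped.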

vladtkachuk