This is my first time posting, so I hope I am getting it right.

I am trying to write a simple shell script that prints columns of a dataset when column 1 is 12 and column 2 is within 1000000 of a variable number (snp), i.e. between snp-1000000 and snp+1000000. Here is what I have:


SNPS="9000000 8000000"

for snp in $SNPS; do 

awk '{if ($1 == 12 && $2 <= $((snp+1000000)) && $2 >= $((snp-1000000))) 
print $1,$2,$3,$4,$5,$6,$7,$8}' file.txt > snp12_"${snp}"
cat snp12_"${snp}" | sort -u -k8 > snp12_"${snp}"_sorted

done

Although I know there are numbers in column 2 that fit these criteria, I am getting an empty resulting file. Any help would be appreciated. Thank you!

nsuser
  • `snp` isn't an awk variable, only a shell variable. Use `awk -v snp="$snp"` to pass shell variables to awk. – Charles Duffy May 16 '17 at 18:12
  • BTW, you might benchmark having just one pipeline -- ie. `awk -v snp="$snp" '{if ($1 == 12 && $2 <= $((snp+1000000)) && $2 >= $((snp-1000000))) print $1,$2,$3,$4,$5,$6,$7,$8}' file.txt | tee "snp12_$snp" | sort -u -k8 >"snp12_${snp}_sorted"`; that way `sort` can run while `awk` is still going, rather than waiting for `awk` to completely finish before you start `sort` at all. (OTOH, because that means `sort` needs to read from a FIFO rather than a seekable file handle it adds overhead at a different point, so it's hard to predict what the overall impact would be). – Charles Duffy May 16 '17 at 18:14
  • Thanks Charles! I also had to remove the `$` before each mathematical expression, so: `awk -v snp="$snp" '{if ($1 == 12 && $2 <= ((snp+1000000)) && $2 >= ((snp-1000000))) print $1,$2,$3,$4,$5,$6,$7,$8}' file.txt` – nsuser May 16 '17 at 20:24
  • No need for double parens in this case either. – Charles Duffy May 16 '17 at 20:28
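
Putting the comments above together, a minimal sketch of the corrected loop might look like this. It passes the shell variable into awk with `-v`, drops the shell-only `$(( ))` syntax, and uses the single `tee` pipeline suggested in the comments. The file `sample.txt`, its contents, and the awk variable name `w` are made up for illustration; substitute the real `file.txt` and window size.

```shell
#!/bin/sh
# Hypothetical sample data standing in for file.txt (8 whitespace-separated columns).
cat > sample.txt <<'EOF'
12 8500000 a b c d e x1
12 9900000 a b c d e x2
12 11000000 a b c d e x3
11 9000000 a b c d e x4
12 7000000 a b c d e x5
EOF

SNPS="9000000 8000000"
window=1000000   # half-width of the range around each snp (assumed value)

for snp in $SNPS; do
    # -v copies shell values into awk variables; inside the awk program,
    # plain arithmetic (snp - w) is used instead of the shell's $(( )).
    awk -v snp="$snp" -v w="$window" '
        $1 == 12 && $2 >= snp - w && $2 <= snp + w {
            print $1, $2, $3, $4, $5, $6, $7, $8
        }' sample.txt |
        tee "snp12_${snp}" |                  # keep the unsorted matches
        sort -u -k8 > "snp12_${snp}_sorted"   # sort and de-duplicate on field 8 onward
done
```

With this sample data, each `_sorted` file should end up with the two rows whose second column falls inside the window for that snp; rows with a different first column or an out-of-range second column are dropped.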
