
I have a tabular file

 V1      V2      V3   V4      V5 V6       V7      V8      V9
chr1 3670715 3671052  338 3670940  8  4.18708 3.36070 2.11284
chr1 3671795 3672053  259 3671953 14  7.60682 4.53642 5.15603
chr1 4491782 4493687 1906 4491915 20 11.42107 5.49862 8.69791
chr1 4491782 4493687 1906 4492254 18  8.58343 4.41588 6.05103
chr1 4491782 4493687 1906 4492555 11  5.49023 3.77545 3.25097
chr1 4491782 4493687 1906 4492907 16  8.45705 4.66761 5.94094

I am applying several filters to the file using user-supplied values, but the result is not correct.

My shell script:

# Run : sh filter.sh 5 10 20 
fc=$2; pVal=$3; tags=$4
sed 1,29d $1 | awk '$6>int("'$tags'") && $7>int("'$pVal'") && int($8)>int('"$fc"')' | wc -l 

I am filtering the file on three values (the first 29 lines are header), but the output is wrong. I checked in R: the count should be 18967, but the script above gives 13608. I wrapped the variables in the int function, but in vain. Should I reformat my variable values, or what am I missing?

Thanks

Sukhi

1 Answer


It is not a math error in awk.

You need to use -v name=val to pass shell variables to awk, which also simplifies your awk command:

awk -v tags="$tags" -v pVal="$pVal" -v fc="$fc" '$6>tags && $7>pVal && int($8)>fc'
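
As a rough, self-contained sketch of the -v approach, the script below runs the same filter on the six sample rows from the question. The thresholds (fc=4, pVal=5, tags=10), the temporary file, and the one-line header are assumptions made for illustration; the real file would need NR>29 instead of NR>1. It also counts the rows with and without int() around $8, because int() truncates toward zero. That may be one reason the count differs from R, assuming the R check compared the untruncated values.

```shell
#!/bin/sh
# Hypothetical sample file: a one-line header plus the six rows from the question.
data=$(mktemp)
cat > "$data" <<'EOF'
 V1      V2      V3   V4      V5 V6       V7      V8      V9
chr1 3670715 3671052  338 3670940  8  4.18708 3.36070 2.11284
chr1 3671795 3672053  259 3671953 14  7.60682 4.53642 5.15603
chr1 4491782 4493687 1906 4491915 20 11.42107 5.49862 8.69791
chr1 4491782 4493687 1906 4492254 18  8.58343 4.41588 6.05103
chr1 4491782 4493687 1906 4492555 11  5.49023 3.77545 3.25097
chr1 4491782 4493687 1906 4492907 16  8.45705 4.66761 5.94094
EOF

# Illustrative thresholds (assumed values, not the asker's).
fc=4; pVal=5; tags=10

# Shell variables reach awk through -v; NR>1 skips the header line.
# Counting inside awk avoids the padding some wc implementations print.
with_int=$(awk -v tags="$tags" -v pVal="$pVal" -v fc="$fc" \
  'NR>1 && $6>tags && $7>pVal && int($8)>fc {n++} END {print n+0}' "$data")

# Same filter, but $8 is compared without truncation.
without_int=$(awk -v tags="$tags" -v pVal="$pVal" -v fc="$fc" \
  'NR>1 && $6>tags && $7>pVal && $8>fc {n++} END {print n+0}' "$data")

echo "with int(\$8):    $with_int"     # prints 1
echo "without int(\$8): $without_int"  # prints 4

rm -f "$data"
```

With these sample values, three rows have a V8 between 4 and 5, so int($8)>fc rejects them while $8>fc keeps them.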
anubhava