Filter tabular file with range expression

Question

Here is an example file. I want to print the lines where the number in column 2 falls within a range defined by two shell variables.

Test    198     A   0   
Test    199     A   2   
Test    2       A   0
Test    202     A   22  
Test    122859  G   199
Test    198589  A   0

For example, if $start is 198 and $end is 202, I only want these lines:

Test    198 A   0   
Test    199 A   2   
Test    202 A   22

But not:

Test    122859      G   **199**
Test    **198**589  A   0

I tried several combinations of awk and sed and haven't found one that works properly in my script.

sed -n -e "/\t$start\t/,/\t$end\t/p" file

This was my initial attempt. It works really well except in this case:

Test    122859      G   **199**

So I tried awk, but it wasn't successful, especially in dealing with this case:

Test    **198**589  A   0

awk '$2 == "$start", $2 == "$end"' file

or

awk "$2 ~ /\t$start\t/,/\t$end\t/" file

Is there a way to correct one of these to make it do what I need?

Thanks

Use the `-v` method from the accepted answer in the above question. If it's not clear how to apply it to your specific problem, let us know. — jas, Feb 27 '20 at 13:55
Indeed, I didn't know there was a special way to pass shell variables into awk, thanks! It makes the first awk expression work (if the file is sorted), but the second one returns nothing. — LauraR, Feb 27 '20 at 14:07

score 0 · Accepted Answer · answered Feb 27 '20 at 13:56

It looks like you are trying to select ranges in the file, but you are really selecting ranges in a single field.

Try with

start=198
end=202
awk -v start="$start" -v end="$end" '$2>=start && $2<=end' file

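First, pass the shell variables to awk with -v so they are easy to use in the awk code. Then tell awk to select every line whose second field is greater than or equal to start and less than or equal to end. Because each line is tested on its own, the file does not need to be sorted, and values such as 198589 or a 199 in another column no longer match.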
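
As a quick check, here is a minimal sketch that runs the same filter on the sample data from the question. The temporary file and the `result` variable are assumptions added for illustration only, and the data is assumed to be tab-separated as in the sed attempt. Values passed with -v that look like numbers are compared numerically, which is what makes the comparison behave as a numeric range rather than a text match.

```shell
# Recreate the sample data from the question in a temporary file
# (hypothetical helper file, removed at the end).
tmpfile=$(mktemp)
printf 'Test\t198\tA\t0\n'    >  "$tmpfile"
printf 'Test\t199\tA\t2\n'    >> "$tmpfile"
printf 'Test\t2\tA\t0\n'      >> "$tmpfile"
printf 'Test\t202\tA\t22\n'   >> "$tmpfile"
printf 'Test\t122859\tG\t199\n' >> "$tmpfile"
printf 'Test\t198589\tA\t0\n' >> "$tmpfile"

start=198
end=202

# Keep only the lines whose second field lies in [start, end].
# Each line is tested independently, so the order of lines does not matter.
result=$(awk -v start="$start" -v end="$end" '$2 >= start && $2 <= end' "$tmpfile")

printf '%s\n' "$result"
rm -f "$tmpfile"
```

With the sample data, only the lines whose second field is 198, 199, or 202 are printed. The lines containing 122859 and 198589 are skipped because the whole field is compared as a number instead of being searched for the text "198" or "199".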
