I have very large text files of the form below:

>randomheader1 some info flag1
data
moredata
someextradata
>randomheader2 some info flag2
littledata
somedata
>randomheader3 some info flag1
one
two
three
four
>randomheader4 some info flag3
....

I want to write every header line containing flag1, together with the data lines that follow it (up to the next header), to another file, such as:

>randomheader1 some info flag1
data
moredata
someextradata
>randomheader3 some info flag1
one
two
three
four

I've been searching for a solution and I've checked this answer. However, since the start and end patterns I'm matching are the same character (namely >), it didn't work. I'm looking for a solution in bash.

evolozzy

2 Answers

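Using awk, set a flag on each header line (a line starting with >) depending on whether it contains flag1, and print every line while the flag is set:

awk '{ if ($0 ~ /^>/) { if ($0 ~ /flag1/) { flag = "Y" } else { flag = "" } } } flag' temp.txt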

Demo:

$ cat temp.txt
>randomheader1 some info flag1
data
moredata
someextradata
>randomheader2 some info flag2
littledata
somedata
>randomheader3 some info flag1
one
two
three
four
>randomheader4 some info flag3
$ awk '{ if ($0 ~ /^>/) { if ($0 ~ /flag1/) { flag = "Y" } else { flag = "" } } } flag' temp.txt
>randomheader1 some info flag1
data
moredata
someextradata
>randomheader3 some info flag1
one
two
three
four
$
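
The same idea can be written more compactly, since awk evaluates a comparison to 1 or 0 and that result can be stored directly on each header line. The following is only a sketch: the function name `filter_records` and the variable names `flag` and `keep` are illustrative and not part of the answer above, and the flag is matched as a literal substring rather than as a regular expression.

```shell
#!/bin/bash
# Hypothetical helper: print records whose header line contains a flag.
# Usage: filter_records FLAG [FILE...]   (reads stdin if no FILE is given)
filter_records() {
    local flag=$1
    shift
    # On header lines (starting with ">"), set keep to 1 if the header
    # contains the flag as a literal substring, otherwise to 0.
    # The bare pattern "keep" then prints every line while keep is non-zero.
    # Note: awk -v interprets backslash escapes in the assigned value.
    awk -v flag="$flag" '/^>/ { keep = (index($0, flag) > 0) } keep' "$@"
}
```

It could be called as `filter_records flag1 temp.txt > outfile`. Because this is a substring match, `flag1` would also select a header containing `flag10`; a stricter match would be needed if such flags can occur.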

Digvijay S

Assuming the data file doesn't contain any null characters ('\0'), a solution in pure bash could be:

$ cat filter

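#!/bin/bash

# Print every record whose header line ends with " flag1".
in_flag=
while IFS= read -r line; do
    case $line in
        \>*\ flag1) in_flag=t ;;   # header ending in " flag1": start printing
        \>*) in_flag= ;;           # any other header: stop printing
    esac
    # printf, unlike echo, is safe for lines such as "-n" or "-e"
    [[ -n $in_flag ]] && printf '%s\n' "$line"
done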

Run it as

./filter < datafile > outfile
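
The `case` pattern in the script only matches headers that end in ` flag1`, which fits the sample data. If the flag could appear elsewhere in the header, or the last line of the file might lack a trailing newline, a variant along these lines may help. This is a sketch under those assumptions; the function name `filter_flag` is hypothetical.

```shell
#!/bin/bash
# Hypothetical variant: keep records whose header contains FLAG as a
# whole space-separated word, wherever it appears in the header.
# Usage: filter_flag FLAG < datafile > outfile
filter_flag() {
    local flag=$1 line in_flag=
    # "|| [[ -n $line ]]" also handles a final line with no trailing newline
    while IFS= read -r line || [[ -n $line ]]; do
        case $line in
            \>*)
                # Header line: pad with spaces so the flag must match a whole word
                case " ${line#>} " in
                    *" $flag "*) in_flag=t ;;
                    *) in_flag= ;;
                esac ;;
        esac
        [[ -n $in_flag ]] && printf '%s\n' "$line"
    done
    return 0   # do not report failure just because the last line was skipped
}
```

Matching on whole words means that `flag1` does not select a header carrying `flag10`.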
M. Nejat Aydin