How to print lines between two patterns with optional end pattern

Question

I have gone through Stack Overflow and found these questions:

How to print lines between two patterns, inclusive or exclusive (in sed, AWK or Perl)?

Combine multiple lines between flags in one line in AWK

My case differs because there can be another TAG1 without a matching TAG2, like this:

file.txt:

aa

TAG1
some right text
TAG2

some text2

TAG1
some text3

TAG1
some text4

TAG1
some right text 2
TAG2

some text4

TAG1
some text5
some text6

expected output:

TAG1
some right text
TAG2

TAG1
some right text 2
TAG2

score 1 · Accepted Answer · answered Nov 03 '22 at 12:23

One way is to reverse the input, get TAG2 to TAG1 and then reverse again:

$ tac ip.txt | sed -n '/TAG2/,/TAG1/p' | tac
TAG1
some right text
TAG2
TAG1
some right text 2
TAG2
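
The reversal trick can be tried end to end with the sketch below. The function name `closed_blocks_tac` and the shortened sample input are assumptions made for illustration, not part of the answer, and `tac` (GNU coreutils) has to be installed. In the reversed stream every TAG2 comes first, so the sed range always ends at the nearest TAG1, and a TAG1 that was never closed does not fall inside any range.

```shell
#!/bin/sh
# Hypothetical helper: print TAG1..TAG2 blocks and skip any TAG1 that is
# never closed by a TAG2. Relies on tac from GNU coreutils.
closed_blocks_tac() {
    tac "$1" |                    # last line first
        sed -n '/TAG2/,/TAG1/p' | # from each TAG2 up to the nearest TAG1
        tac                       # restore the original order
}

# Sample input (an assumption, shortened from the file in the question).
sample=$(mktemp)
printf '%s\n' aa TAG1 'some right text' TAG2 'some text2' \
    TAG1 'some text3' TAG1 'some right text 2' TAG2 TAG1 'some text5' > "$sample"

tac_result=$(closed_blocks_tac "$sample")
printf '%s\n' "$tac_result"
rm -f "$sample"
```

On this sample it should print the two closed blocks back to back, with no separator between them.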

Another way is to reset the buffer and start collecting lines whenever TAG1 is found, and print only when TAG2 is found:

$ awk '/TAG1/{f=1; buf=$0; next}
       f{buf=buf ORS $0}
       /TAG2/{if(f) print buf; f=0}' ip.txt
TAG1
some right text
TAG2
TAG1
some right text 2
TAG2

Sundeep

Since files can be large, I would say your second solution is better. Thank you for the quick answer, it works! Can you also please help me understand your solution? What are buf and ORS? I believe f is a flag here. – satya Nov 03 '22 at 12:36
Don't assume, measure ;) `buf` is a variable (a buffer that saves the lines of interest). `ORS` is the output record separator (a newline by default); `$0` does not include the line ending, so you need to add it manually. `f` is also a variable, used here as a state machine flag. – Sundeep Nov 03 '22 at 15:28
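
For reference, here is the same awk program with each rule commented along the lines of the explanation above. The wrapper name `closed_blocks_awk` and the sample input are assumptions for illustration. The sample also contains a stray TAG2 before any TAG1, to show that it prints nothing while the flag is down.

```shell
#!/bin/sh
# Hypothetical wrapper around the awk program from the answer.
closed_blocks_awk() {
    awk '
        # Start tag: raise the flag and restart the buffer with this line,
        # discarding whatever an earlier, unclosed TAG1 had collected.
        /TAG1/ { f = 1; buf = $0; next }

        # While the flag is up, append the current line. ORS (a newline by
        # default) is the separator because $0 carries no line ending.
        f { buf = buf ORS $0 }

        # End tag: print the buffer only if a TAG1 opened it, then lower
        # the flag so later lines are ignored until the next TAG1.
        /TAG2/ { if (f) print buf; f = 0 }
    ' "$1"
}

# Sample input (an assumption): stray TAG2, unclosed TAG1, one closed block.
sample=$(mktemp)
printf '%s\n' TAG2 TAG1 x TAG1 'some right text' TAG2 TAG1 tail > "$sample"

awk_result=$(closed_blocks_awk "$sample")
printf '%s\n' "$awk_result"
rm -f "$sample"
```

Only one line buffer is held at a time, so memory use depends on the longest run of lines after a TAG1 rather than on the file size.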

Thor · Answer 2 · 2022-11-04T14:04:59.547

0

Here is an example with GNU sed. Collect the data into pattern space and print only when a matching TAG1/TAG2 pair is found:

sed -nE ':a; /TAG1$/ s/.*(TAG1)/\1/; N; /TAG2$/ { /^TAG1/ { G; p; }; z; }; ba' infile

Or as a stand-alone script with explanation:

parse.sed

:a                            # main loop
/TAG1$/ s/.*(TAG1)/\1/        # On a new TAG1, drop everything collected before it
N                             # Append the next input line to pattern space
/TAG2$/ {                     # When TAG2 is encountered
  /^TAG1/ { G; p; }           # and the buffer starts with TAG1: add a blank line, print
  z                           # Clear pattern space (GNU extension)
}
ba                            # Repeat main loop

Run it like this:

sed -nEf parse.sed infile
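
A quick way to try the one-liner form on a small input is sketched below. The sample data and the variable names are assumptions for illustration. Because `z` is a GNU extension, the sketch runs the command only when `sed --version` reports GNU sed. The hold space is never filled, so `G` appends just a newline, and each printed block is followed by an empty line.

```shell
#!/bin/sh
# Sample input (an assumption, shortened from the file in the question).
infile=$(mktemp)
printf '%s\n' aa TAG1 'some right text' TAG2 'some text2' \
    TAG1 'some text3' TAG1 'some right text 2' TAG2 TAG1 'some text5' > "$infile"

# Run only when sed identifies itself as GNU sed, since z is not portable.
if sed --version 2>/dev/null | grep -q '(GNU sed)'; then
    sed_result=$(sed -nE ':a; /TAG1$/ s/.*(TAG1)/\1/; N; /TAG2$/ { /^TAG1/ { G; p; }; z; }; ba' "$infile")
    printf '%s\n' "$sed_result"
else
    echo 'GNU sed not found; skipping' >&2
fi
rm -f "$infile"
```

Unlike the awk and `tac` variants, this one separates the blocks with an empty line, which matches the expected output shown in the question.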

edited Nov 04 '22 at 14:04

answered Nov 03 '22 at 13:44

Thor

Thanks, this works too! although I couldn't understand a bit :) – satya Nov 03 '22 at 14:51
@satya: added some explanation – Thor Nov 04 '22 at 14:05
