Extracting all lines between and including 2 different delimiters when the second one is repeated

Question

Let's assume the following file:

delimiter_1
1
2
delimiter_22
blah
blah blah
delimiter_3
3
2
delimiter_2

The goal is to extract all lines between and including 'delimiter_3' and 'delimiter_2' to get:

delimiter_3
3
2
delimiter_2

This can be done with:

awk "/^delimiter_3$/{a=1};a;/^delimiter_2$/{exit}" file

However, if 'delimiter_2' is repeated such as:

delimiter_1
1
2
delimiter_2
blah
blah blah
delimiter_3
3
2
delimiter_2

the previous awk command returns an empty result.

Is this an issue with the command or awk?

P.S: I've noticed some other similarly worded questions, but AFAIK, none of them covers the exact same use-case.

EDIT: I've replaced all mentions of 'pattern' with 'delimiter'.

Ed Morton · Accepted Answer · 2021-08-10T17:08:36.643

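a;/^delimiter_2$/{exit} should be a;a&&/^delimiter_2$/{exit} or similar, so that the delimiter_2 comparison and the subsequent exit only happen when a is true. As written, the script exits at the first delimiter_2 line, before delimiter_3 has been seen and a has been set, so nothing is printed. Given non-nested, non-overlapping ranges, the better general approach is this, to find all matches: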

/start/{f=1} f{print; if (/end/) f=0}

or this to just find the first one:

/start/{f=1} f{print; if (/end/) exit}
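As a rough sketch of how the flag-based form above might be applied to the second sample file from the question: the file name sample.txt and the shell variable names broken and fixed are assumptions made for illustration, and the regexps are anchored with ^ and $ as in the question's command.

```shell
# Recreate the second sample file from the question (sample.txt is a hypothetical name).
printf '%s\n' delimiter_1 1 2 delimiter_2 blah 'blah blah' delimiter_3 3 2 delimiter_2 > sample.txt

# Question's command: the exit fires on the first delimiter_2 line,
# before the flag a has been set, so nothing is printed.
broken=$(awk '/^delimiter_3$/{a=1};a;/^delimiter_2$/{exit}' sample.txt)

# Flag-based form: the end delimiter is only tested once the flag is set.
fixed=$(awk '/^delimiter_3$/{f=1} f{print; if (/^delimiter_2$/) exit}' sample.txt)

printf 'question command printed: [%s]\n' "$broken"
printf '%s\n' "$fixed"
```

With that input, the second command should print the four lines from delimiter_3 through the final delimiter_2, while the first prints nothing. Single quotes are used around the awk script here so the shell leaves the $ characters alone.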

Using the variable name a to indicate you found a match is less useful/obvious/clear than using f (also, a is often used for the name of an array).

Never use the word "pattern" when talking about matching text, btw, use regexp or string, whichever one you mean. See How do I find the text that matches a pattern?.

And if you're ever tempted to use a range expression instead of a flag for this, I wouldn't - see Is a /start/,/end/ range expression ever useful in awk?.
