Using awk to extract lines between patterns

Question

I am trying to use awk to extract the lines between two patterns, but since the second pattern consists of several $ signs ($$$$), I cannot work out how to escape the $ properly.

The input is:

Name1
iii
iii
$$$$

Name2
ii
ooo
ppp
$$$$

Name3
pp
oo
ii
uu
$$$$

Desired output:

Name2
ii
ooo
ppp
$$$$

I tried something like this:

awk 'BEGIN {RS="\\$\\$\\$\\$\n"; FS="\n"} $1==Name2 {print $0; print "$$$$"}' inputfile

I also tried something like

awk '/^Name2/,/\\$\\$\\$\\$\\/' input

I tried many different ways of escaping the $, but I am doing something wrong: either nothing is printed or the entire file is printed.

Many thanks for any suggestions.

I put a return after each $$$$ and tried something like awk 'BEGIN {RS="\\$\\$\\$\\$\n"; FS="\n"} $1==Name2 {print $0; print "$$$$"}' — BBVV94, Dec 09 '17 at 11:34
Possible duplicate of [How to select lines between two patterns?](https://stackoverflow.com/questions/38972736/how-to-select-lines-between-two-patterns) — James Brown, Dec 09 '17 at 12:57

karakfa · Answer 1 · 2017-12-09T14:41:03.727

2

You don't have to use patterns if you're looking for a literal match:

awk '$0=="Name2",$0=="$$$$"' file

This extracts the lines between the two literals, both inclusive. If the first match needs to be a pattern, combine the two:

awk '/Name2/,$0=="$$$$"' file
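If a regexp is still wanted for the end marker, the following is a minimal sketch of one way to escape the dollar signs. It assumes the sample input from the question; `sample` is a hypothetical helper that just reprints it. Inside a `/.../` regexp literal in a single-quoted awk program, one backslash per `$` is enough, and the `^` and trailing `$` anchors require the whole line to match.

```shell
# Hypothetical helper that reprints the sample input from the question.
sample() {
  printf '%s\n' 'Name1' 'iii' 'iii' '$$$$' '' \
                'Name2' 'ii' 'ooo' 'ppp' '$$$$' '' \
                'Name3' 'pp' 'oo' 'ii' 'uu' '$$$$'
}

# Range pattern: start at a line that is exactly Name2,
# stop at a line that is exactly $$$$.
# \$ is a literal dollar sign; the last, unescaped $ anchors the end of the line.
out=$(sample | awk '/^Name2$/,/^\$\$\$\$$/')
printf '%s\n' "$out"
```

The doubled backslash (`"\\$"`) is only needed when the regexp is written as a string, for example when assigning to RS. In a `/.../` literal, `\\` matches a literal backslash, which is likely why the range in the question never found its end line.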

edited Dec 09 '17 at 14:41

answered Dec 09 '17 at 14:33

karakfa


Interesting! Never knew that – hek2mgl Dec 09 '17 at 15:03

s/pattern/regexp/g. patterns are for quilts and sweaters. – Ed Morton Dec 09 '17 at 16:07

score 1 · Answer 2 · answered Dec 09 '17 at 11:48

1

Awk solution:

awk '/^Name2/{ f=1 }f; $1=="$$$$"{ f=0 }' file

The variable f is a flag: it is set when a line starts with Name2, every line is printed while it is set, and it is cleared again after the $$$$ line has been printed.

The output:

Name2
ii
ooo
ppp
$$$$
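Note that `/^Name2/` would also start printing at a key such as Name20. A minimal variation, assuming each key occupies a whole line on its own, compares the line as a string instead. The `sample` helper and its Name20 record are made up for illustration.

```shell
# Hypothetical input: a Name20 record placed before the wanted Name2 record.
sample() {
  printf '%s\n' 'Name20' 'xx' '$$$$' '' \
                'Name2' 'ii' 'ooo' 'ppp' '$$$$'
}

# f is set on a line that is exactly Name2,
# the bare pattern f prints every line while the flag is set,
# and f is cleared only after the closing $$$$ line has been printed.
out=$(sample | awk '$0=="Name2"{ f=1 } f; $0=="$$$$"{ f=0 }')
printf '%s\n' "$out"
```

Because both tests are string comparisons, no escaping of `$` is needed at all.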

answered Dec 09 '17 at 11:48

RomanPerekhrest


hek2mgl · Answer 3 · 2017-12-10T14:25:46.357

1

Instead of using $$$$ as the record separator, you can use the blank lines between the blocks, which splits the input into paragraphs:

awk 'BEGIN{RS=""}/Name2/' file

RS="" is a special value. From the awk man page:

If RS is set to the null string, then records are separated by blank lines. When RS is set to the null string, the newline character always acts as a field separator, in addition to whatever value FS may have.

While the above code is fine for your example, you run into trouble when there are keys like Name20, so a regex match might not be the right approach. A string comparison would probably be a better fit:

awk 'BEGIN{RS="";FS="\n"} $1 == "Name2"' file

I'm explicitly setting FS="\n" to avoid splitting within single lines.
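To make the difference concrete, here is a small sketch that counts how many records each variant selects. The `sample` helper and its Name20 record are hypothetical and not part of the original input.

```shell
# Hypothetical input with a Name20 record next to Name2 and Name3.
sample() {
  printf '%s\n' 'Name20' 'xx' '$$$$' '' \
                'Name2' 'ii' 'ooo' 'ppp' '$$$$' '' \
                'Name3' 'pp' '$$$$'
}

# Regexp on the whole record: the Name20 record matches as well.
loose=$(sample | awk 'BEGIN{RS=""} /Name2/ {n++} END{print n+0}')

# String comparison on the first line of the record: only Name2 matches.
strict=$(sample | awk 'BEGIN{RS=""; FS="\n"} $1 == "Name2" {n++} END{print n+0}')

printf 'regexp match: %s record(s)\n' "$loose"
printf 'string match: %s record(s)\n' "$strict"
```

The quotes around "Name2" matter: an unquoted Name2, as in the attempt from the question, is an uninitialized awk variable rather than a string.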

edited Dec 10 '17 at 14:25

answered Dec 09 '17 at 11:53

hek2mgl



This is the right approach, but I'd use `RS=""; FS="\n"; ... $1=="Name2"` for robustness. – Ed Morton Dec 09 '17 at 16:09

Nice to hear that! :) It took me a moment, but I get your point about explicitly setting `FS="\n"`. I was actually under the impression that FS becomes the newline, but the truth is that the newline is added to whatever FS already is. Anyhow, since I'm using `/Name2/` and not `$1=="Name2"`, it should be OK, shouldn't it? But I admit that, depending on the use case, the latter might be the better solution. – hek2mgl Dec 09 '17 at 17:12

The problem with `/Name2/` vs `$1=="Name2"` is that because the former is doing a partial regexp match across the whole record instead of an exact string match on a specific field like the latter is doing, the former will falsely match if/when the text "Name2" appears in various other contexts. As a very probable example - what if a record started with `Name20`? Also consider `I hate that guy Name2` appearing in one of the records for `Name3`, etc. – Ed Morton Dec 09 '17 at 22:52

Many thanks for your explanation! Now it sounds simple. Man! name20, how could I overlook that... Let me change the post tomorrow, it's late here. – hek2mgl Dec 10 '17 at 00:32
