I have a text file test1.txt:

1   first_match
2   not_needed_line1
3   not_needed_line2
4   not_needed_line3
5   second_match
6   not_needed_line4
7   not_needed_line5
8   not_needed_line6
9   not_needed_line7
10  not_needed_line8
11  first_match
12  second_match
13  not_needed_line9
14  not_needed_line10
15  not_needed_line11
16  second_match
17  not_needed_line12
18  not_needed_line13
19  second_match
20  not_needed_line14
21  second_match
22  not_needed_line15
23  not_needed_line16
24  first_match
25  not_needed_line17
26  not_needed_line18
27  second_match

I would like to extract the pairs of lines containing "first_match" and "second_match", and prefix each line of the result with the filename, test1.txt.

In this example, those are lines:

#1 and #5
#11 and #12
#24 and #27

Note that lines #16, #19 and #21 are not included, because no "first_match" line precedes them to complete the pair.

I found an awk (GNU Awk 3.1.6) script that extracts every line from "first_match" through "second_match":

/first_match/{printf FILENAME " - "; f=1} f; /second_match/{f=0}

The result is:

test1.txt - 1   first_match
2   not_needed_line1
3   not_needed_line2
4   not_needed_line3
5   second_match
test1.txt - 11  first_match
12  second_match
test1.txt - 24  first_match
25  not_needed_line17
26  not_needed_line18
27  second_match

Questions:

  1. How can I get only the pairs of lines containing "first_match" and "second_match"?

test1.txt - 1   first_match
test1.txt - 5   second_match
test1.txt - 11  first_match
test1.txt - 12  second_match
test1.txt - 24  first_match
test1.txt - 27  second_match

  2. How can I get only the second line of each pair, "second_match"?

test1.txt - 5   second_match
test1.txt - 12  second_match
test1.txt - 27  second_match

Tax Max
  • Don't ask 2 questions at a time - [edit] your question to just ask one question so we can help you with that and you can ask a followup if you can't figure out the next step. – Ed Morton Oct 27 '21 at 17:37
  • Thanks, Ed for your comment. I tried to adapt your great explanation in question - https://stackoverflow.com/questions/17908555/printing-with-sed-or-awk-a-line-following-a-matching-pattern, but somehow didn't manage - lack of experience with awk. – Tax Max Oct 28 '21 at 06:49
  • You're welcome. If you do what I suggested and make this 1 question instead of 2 then we can help you. – Ed Morton Oct 28 '21 at 13:23

2 Answers

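To print both lines of each "first_match"/"second_match" pair (NR supplies the line number, which assumes the numbers shown in the question are not part of the file itself):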

awk '
    /first_match/ && !f {print FILENAME, "-", NR, $0; f=1}
    /second_match/ && f {print FILENAME, "-", NR, $0; f=0}
' test1.txt

Output:

test1.txt - 1 first_match
test1.txt - 5 second_match
test1.txt - 11 first_match
test1.txt - 12 second_match
test1.txt - 24 first_match
test1.txt - 27 second_match

To print only the "second_match" of the pair:

awk '
    /first_match/ && !f {f=1}
    /second_match/ && f {print FILENAME, "-", NR, $0; f=0}
' test1.txt

Output:

test1.txt - 5 second_match
test1.txt - 12 second_match
test1.txt - 27 second_match

[Edit]
As Ed Morton points out, the "both" version above prints first_match even if there is no corresponding second_match. Here is a strict version:

awk '
  /first_match/ && !f {l1 = FILENAME " - " NR " " $0; f=1}
  /second_match/ && f {print l1; print FILENAME, "-", NR, $0; f=0}
' test1.txt
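
If the leading numbers in test1.txt are part of each line, as the output in the question suggests, printing NR as well would repeat them. The following is a minimal sketch of the strict version under that assumption, printing $0 alone. The shortened sample data, the sample_dir variable and the print_pairs helper are made up for illustration and are not part of the original question.

```shell
# Build a shortened sample file; the leading numbers are part of each
# line here (an assumption about the real test1.txt).
sample_dir=$(mktemp -d)
printf '%s\n' \
  '1   first_match' \
  '2   not_needed_line1' \
  '3   second_match' \
  '4   second_match' \
  '5   first_match' \
  '6   not_needed_line2' > "$sample_dir/test1.txt"

# Hypothetical helper: print only complete first_match/second_match pairs.
print_pairs() {
  awk '
    # Remember the first line of a pair, but do not print it yet.
    /first_match/ && !f { first = $0; f = 1 }
    # The pair is complete: print both lines with the filename prefix.
    /second_match/ && f {
      print FILENAME " - " first
      print FILENAME " - " $0
      f = 0
    }
  ' "$@"
}

# Run from inside the directory so FILENAME is just "test1.txt".
(cd "$sample_dir" && print_pairs test1.txt)
```

With this sample, line 4 is skipped because no "first_match" precedes it, and line 5 is never printed because no "second_match" follows it.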
tshiono

This prints the complete pairs first, then lists the unpaired "second_match" lines (the "onlys") at the end of the output:

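gawk '
# Remember the most recent first_match line.
/first_match/{
  cnt=1
  old=$0
}
/second_match/{
  if (cnt==1){
    # Completes a pair: print both lines.
    print FILENAME, "-", old
    print FILENAME, "-", $0
  }else{
    # No preceding first_match: save it for the "onlys" list.
    only[++o]=FILENAME" - "$0
  }
  cnt=0
}
END{
  print "\nonlys"
  for(i=1;i<=o;i++)
    print only[i]
}' test1.txt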
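
Since the output is prefixed with FILENAME, the script may end up being run on several files at once. In that case cnt and old carry over from one file to the next, so a trailing "first_match" in one file could be paired with a "second_match" in the following file. A minimal sketch of one way to avoid that is to reset the state whenever FNR is 1. The two sample files, the sample_dir variable and the pairs_per_file helper are hypothetical, and the "onlys" list is left out for brevity.

```shell
# Two small sample files (made up for illustration). a.txt ends with an
# unfinished pair; b.txt starts with an unpaired second_match.
sample_dir=$(mktemp -d)
printf '%s\n' '1   second_match' '2   first_match' > "$sample_dir/a.txt"
printf '%s\n' '1   second_match' '2   first_match' '3   second_match' > "$sample_dir/b.txt"

# Hypothetical helper: pair lines within each file only.
pairs_per_file() {
  awk '
    # FNR restarts at 1 for every input file: forget any unfinished pair.
    FNR == 1 { cnt = 0 }
    /first_match/ { cnt = 1; old = $0 }
    /second_match/ {
      if (cnt == 1) {
        print FILENAME, "-", old
        print FILENAME, "-", $0
      }
      cnt = 0
    }
  ' "$@"
}

(cd "$sample_dir" && pairs_per_file a.txt b.txt)
```

Without the FNR == 1 rule, the first line of b.txt would be paired with the last line of a.txt. FNR is used rather than BEGINFILE because BEGINFILE is a later gawk extension, whereas FNR is available in any POSIX awk.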
sroush