1

I have a file like this (test.txt):

abc
12
34
def
56
abc
ghi
78
def
90

I would like to extract the 78 that is enclosed by "abc\nghi" and "def". Currently, I know I can do this with:

cat test.txt | awk '/abc/,/def/' | awk '/ghi/,/def/'

Is there any better way?

Arthur Cheuk
  • 1
    though you wanted only *to **search** the 78* , what should be the final output? – RomanPerekhrest Nov 14 '17 at 10:49
  • hmm..good point.. I thought the command OP tried was giving expected output.. but perhaps only lines between are needed, so I've edited my answer – Sundeep Nov 14 '17 at 11:56

5 Answers

2

One way is to use flags

$ awk '/ghi/ && p~/abc/{f=1} f; /def/{f=0} {p=$0}' test.txt
ghi
78
def
  • {p=$0} saves the current line so that it is available as the previous line when the next line is read
  • /ghi/ && p~/abc/{f=1} sets the flag if the current line contains ghi and the previous line contains abc
  • f; prints the input record as long as the flag is set
  • /def/{f=0} clears the flag if the line contains def


If you only want the lines between these two boundaries

$ awk '/ghi/ && p~/abc/{f=1; next} /def/{f=0} f; {p=$0}' test.txt
78
$ awk '/12/ && p~/abc/{f=1; next} /def/{f=0} f; {p=$0}' test.txt
34

See also How to select lines between two patterns?
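
The flag approach generalizes if the three patterns are passed in as awk variables. The following is a minimal sketch and not part of the original answer: the function name between and the variable names first, second, last, prev and inside are made up for illustration, and the sample data is recreated in a temporary file so that the snippet is self-contained.

```shell
# Recreate the sample data from the question in a temporary file.
sample=$(mktemp)
printf '%s\n' abc 12 34 def 56 abc ghi 78 def 90 > "$sample"

# between FILE FIRST SECOND LAST  (hypothetical helper, for illustration only)
# Prints the lines that follow a FIRST line immediately followed by a SECOND
# line, stopping before the next LAST line. All three are treated as regexes.
between() {
    awk -v first="$2" -v second="$3" -v last="$4" '
        $0 ~ second && prev ~ first { inside = 1; prev = $0; next }  # opening pair found
        $0 ~ last                   { inside = 0 }                   # closing boundary
        inside                                                       # print while the flag is set
        { prev = $0 }                                                # remember the previous line
    ' "$1"
}

between "$sample" abc ghi def
between "$sample" abc 12 def
```

With the sample data, the first call prints 78 and the second prints 34.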

Sundeep
0

This is not really clean, but you can redefine the record separator RS as a regular expression, abc\nghi\n|\ndef. This creates multiple records, however, so you need to keep track of which ones lie between the correct boundaries. With GNU awk you can check which separator text was actually matched using RT.

awk 'BEGIN{RS="abc\nghi\n|\ndef"}
     s && (RT~/def/){print $0}
     {s=(RT~/abc/)}' file

This does the following:

  • sets RS to match either abc\nghi\n or \ndef.
  • checks which separator ended each record: if RT contains abc, you have found the opening boundary.
  • if the opening boundary was found and the RT of the next record contains def, prints that record.
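
To see how this separator splits the input, it can help to print each record together with the kind of terminator that ended it. This is only an illustrative sketch: it assumes GNU awk is installed as gawk, and the helper name show_records and the labels open, close and none are invented here.

```shell
# Recreate the sample data from the question in a temporary file.
sample=$(mktemp)
printf '%s\n' abc 12 34 def 56 abc ghi 78 def 90 > "$sample"

# show_records FILE  (hypothetical helper; requires GNU awk for RT)
# Prints the record number, the record with newlines flattened to spaces,
# and a label describing which alternative of RS terminated it.
show_records() {
    gawk 'BEGIN { RS = "abc\nghi\n|\ndef" }
    {
        kind = (RT ~ /abc/) ? "open" : ((RT ~ /def/) ? "close" : "none")
        gsub(/\n/, " ")                      # flatten the record for display only
        printf "%d [%s] %s\n", NR, $0, kind
    }' "$1"
}

# Only run the demonstration when gawk is actually available.
if command -v gawk >/dev/null 2>&1; then
    show_records "$sample"
fi
```

For the sample file, the record containing 78 is the third one: it follows a record labelled open and is itself labelled close, which is the condition the program has to test.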
kvantour
0

grep alternative

$ grep -Pazo '(?s)(?<=abc\nghi\n).*?(?=\ndef)' file

but I think awk will be better

karakfa
  • GNU grep only. The `-P` option doesn't work in the BSDs (incl macOS), though `pcregrep` is often available as an add-on package. – ghoti Nov 16 '17 at 03:36
0

You could do this with sed. It's not ideal in that it doesn't actually understand records, but it might work for you...

sed -Ene 'H;${x;s/.*\nabc\nghi\n([0-9]+)\ndef\n.*/\1/;p;}' input.txt

Here's what's basically going on:

  • H - appends the current line to sed's "hold space"
  • ${ - specifies the start of a series of commands that will be run once we come to the end of the file
  • x - swaps the hold space with the pattern space, so that future substitutions will work on what was stored using H
  • s/../../ - analyses the pattern space (which is now multi-line), capturing the data specified in your question, replacing the entire pattern space with the bracketed expression...
  • p - prints the result.

One important factor here is that the regular expression is ERE, so the -E option is important. If your version of sed uses some other option to enable support for ERE, then use that option instead.

Another consideration is that the regex above assumes Unix-style line endings. If you try to process a text file that was generated on DOS or Windows, the regex may need to be a little different.
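
One possible way to handle DOS or Windows line endings is to strip the carriage returns before the data reaches sed, so that the regex can stay as it is. The snippet below is a sketch under that assumption; the helper name extract_crlf and the generated CRLF sample are hypothetical and only serve the illustration.

```shell
# Build a CRLF copy of the sample data to mimic a Windows-generated file.
sample_crlf=$(mktemp)
printf '%s\r\n' abc 12 34 def 56 abc ghi 78 def 90 > "$sample_crlf"

# extract_crlf FILE  (hypothetical helper, for illustration only)
# Deletes every carriage return with tr, then runs the same sed program.
extract_crlf() {
    tr -d '\r' < "$1" |
        sed -Ene 'H;${x;s/.*\nabc\nghi\n([0-9]+)\ndef\n.*/\1/;p;}'
}

extract_crlf "$sample_crlf"
```

Deleting the carriage returns up front keeps the sed expression identical for both kinds of file; the alternative would be to allow an optional \r before each \n in the regex.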

ghoti
-1

awk solution:

awk '/ghi/ && r=="abc" { f=1; n=NR+1 } f && NR==n { v=$0 } v && NR==n+1 { print v } { r=$0 }' file

The output:

78

Bonus GNU awk approach:

awk -v RS= 'match($0,/\nabc\nghi\n(.+)\ndef/,a){ print a[1] }' file
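
The three-argument form of match() is a GNU awk extension. Where only a POSIX awk is available, a similar result can be sketched with the RSTART and RLENGTH variables that match() sets. This is a variant under stated assumptions, not the answer's own code: the helper name extract_portable is hypothetical, and unlike (.+) above, the pattern [^\n]+ accepts only a single line between the boundaries.

```shell
# Recreate the sample data from the question in a temporary file.
sample=$(mktemp)
printf '%s\n' abc 12 34 def 56 abc ghi 78 def 90 > "$sample"

# extract_portable FILE  (hypothetical helper, for illustration only)
# Reads the file as one record (paragraph mode) and prints the single line
# found between "abc\nghi\n" and "\ndef".
extract_portable() {
    awk -v RS= '
        match($0, /\nabc\nghi\n[^\n]+\ndef/) {
            # The match starts with the 9 characters "\nabc\nghi\n" and ends
            # with the 4 characters "\ndef"; keep only what lies in between.
            print substr($0, RSTART + 9, RLENGTH - 13)
        }' "$1"
}

extract_portable "$sample"
```

The fixed offsets 9 and 13 are tied to these particular boundary strings and would need to be recomputed for other patterns.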
RomanPerekhrest
  • 1
    downvote without a comment doesn't give much point/meaning for possible answer improvement. Therefore, such downvote is pointless – RomanPerekhrest Nov 15 '17 at 12:00