2

I am trying to extract the trace of a specific event from log files. To find the relevant event, I look for a $PATTERN$. To get the complete trace of the event, I want to extract the lines on either side of the pattern, up to and including the enclosing $SEPARATOR$ lines.

For example, if the contents of the log file are

Line1
Line2
SEP
Line3
Line4
Name=PATTERN
Line5
SEP
Line 6
...

I want to extract

SEP
Line3
Line4
Name=PATTERN
Line5
SEP

I tried to use sed and got it working for single line matches as below:

echo "randomStringSEPrandomPATTERNrandomSEPrandom" | sed -n 's/^.*\(SEP.*PATTERN.*SEP\).*/\1/p'

returns SEPrandomPATTERNrandomSEP

Any help on how to extend it for multiple lines would be much appreciated. Thanks.

Karthik

3 Answers

3

This is not a very natural task for sed. Use awk instead.

A gawk-specific version (thanks Jotne for corrections):

gawk -vRS="SEP" '/PATTERN/ {print RT $0 RT}'

A version for POSIX awk. Should work on BSD/OSX.

awk '
  /SEP/ {
    out = out $0 "\n"
    if (in_seps == 1) {
      if (pattern_found) {
        printf "%s", out   # print literally; printf(out) would treat any % in the log as a format
        pattern_found = 0
      }
      in_seps = 0
      out = ""
    } else
      in_seps = 1
    next
  }

  in_seps == 1 {
    out = out $0 "\n"
  }

  in_seps == 1 && /PATTERN/ {   # only count matches inside a block
    pattern_found = 1
  }
'
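
As a usage sketch, the same approach can be wrapped in a shell function that takes the separator and the pattern as arguments. The function name `extract_block` and the variables `sep`, `pat`, `buf`, and `found` are made up for this illustration. Note one deliberate difference from the script above: `index()` does a literal substring match, so the separator and pattern are not treated as regular expressions here.

```shell
# extract_block SEP PATTERN [FILE...]
# Hypothetical helper: prints every SEP-delimited block that contains PATTERN.
# Reads stdin when no FILE is given.
extract_block() {
  sep=$1 pat=$2
  shift 2
  awk -v sep="$sep" -v pat="$pat" '
    index($0, sep) {                        # separator line (literal match)
      if (in_block) {                       # closing separator
        if (found) printf "%s%s\n", buf, $0
        in_block = 0
      } else {                              # opening separator
        in_block = 1
      }
      buf = (in_block ? $0 "\n" : "")       # start a new buffer or discard the old one
      found = 0
      next
    }
    in_block {                              # line inside a block
      buf = buf $0 "\n"
      if (index($0, pat)) found = 1
    }
  ' "$@"
}

# Sample from the question: prints the six lines from the first SEP to the second SEP.
printf '%s\n' Line1 Line2 SEP Line3 Line4 Name=PATTERN Line5 SEP 'Line 6' |
  extract_block SEP PATTERN
```

Because the buffer is reset at every separator, only the current block is held in memory, and a PATTERN that appears outside any block is ignored.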

A sed script. Uses GNU extension T (like t but opposite condition).

sed -n '
  H               # append line to holdspace
  /SEP/ {         # if line was a separator
    x             # exchange pattspace and holdspace
    s/^SEP/&/     # check if it begins with a separator
    T             # if it doesn't, go to next line
    s/PATTERN/&/  # check if it contains the pattern
    T             # if it doesn't, go to next line
    p             # print it
  }
'
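
Where the `T` extension is not available, a possible variant folds both checks into a single address. This is only a sketch: it assumes that `.` matches the newlines embedded in the pattern space (which is how GNU sed behaves), and the wrapper name `sep_blocks_sed` is hypothetical. As in the script above, the closing separator stays in the hold space, so it also acts as the opener of whatever follows.

```shell
# Hypothetical wrapper; reads stdin or the files given as arguments.
sep_blocks_sed() {
  sed -n '
    # append every line to the hold space
    H
    /SEP/ {
      # swap: pattern space gets the accumulated text, hold space keeps this SEP
      x
      # print only if the text starts with SEP and contains PATTERN
      /^SEP.*PATTERN/p
    }
  ' "$@"
}

printf '%s\n' Line1 Line2 SEP Line3 Line4 Name=PATTERN Line5 SEP 'Line 6' |
  sep_blocks_sed
```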
ooga
  • Concise and powerful, +1 :) – zx81 Jul 18 '14 at 00:26
  • 2
    and just mention it's GNU awk specific due to the multi-char RS. Any other awk would just use `S` as the RS given that command. – Ed Morton Jul 18 '14 at 01:55
  • +1; also works with `mawk`, the default on Ubuntu (but not with FreeBSD/OSX `awk`). – mklement0 Jul 18 '14 at 03:25
  • Just a small note. The `gnu awk` does not give exactly what OP requested. It should also print the `SEP`. You could do `gawk '/PATTERN/ {print RT $0 RT}' RS="SEP" file` This will also get rid of the blank line at the end that you get. – Jotne Jul 18 '14 at 06:22
1

Here is an awk script that should work with most versions of awk:

awk '{a[NR]=$0} s && /^SEP/ {e=NR;next} /^SEP/ {s=NR} /PATTERN/ {f=NR} END {if (f>s && f<e) for (i=s;i<=e;i++) print a[i]}' file
SEP
Line3
Line4
Name=PATTERN
Line5
SEP

How it works

awk '
    {a[NR]=$0}              # Store every line in the array "a"
s && /^SEP/ {               # If flag "s" is true and line starts with "SEP" do
    e=NR                    # set end flag "e" to "NR"
    next}                   # and skip to next line
/^SEP/ {                    # If line starts with "SEP" do
    s=NR}                   # set start flag "s" to "NR"
/PATTERN/ {                 # If line contains "PATTERN" do
    f=NR}                   # set flag "f" to "NR"
END {                       # END section
    if (f>s && f<e)         # If "f" flag is larger than "s" flag and less than "e" flag (pattern within range) do
        for (i=s;i<=e;i++)  # Loop from "s" to "e"
            print a[i]}     # and print the array "a" from this position
    ' file
Jotne
0

Either I am missing the point, or this is a trivial task for sed (re: your remark) when the input looks like your sample, though not when the separators are on the same line (as in your test).

sed -n "/${Separator}/,/${Separator}/ {
   H;g
   /\n${Separator}.*${Separator}$/ {
      s/.\(.*${pattern}.*\)/\1/p
      s/.*//;h
      }
   }" YourFile

This assumes Separator contains no characters with a special meaning in a basic regular expression, which holds for word-only or alphanumeric content.

If the separators are on the same line, see the other sed answer.

NeronLeVelu
  • You missed the difficult part. You're only supposed to print what's between the separators if what's between them also contains PATTERN. – ooga Jul 18 '14 at 14:19
  • sorry for that, changed the code in post taking that in consideration – NeronLeVelu Jul 21 '14 at 05:24