I'm new to awk and am trying to write a bash script that uses it to print the lines between two patterns in a log file, but for the life of me I cannot make it work.

I am thinking I need to escape some of the characters.

Here's an example of the section of log I am trying to get lines from:

Processing... AP710  (/var/opt/testsys/rptprint/AP710)
sidjosajdois
sokds3488sds
doskdoskdoskdo
sodk229929
sending entire report to Job Mgr (spool) for user

I want the four lines between the "Processing..." line (first pattern) and the "sending" line (second pattern). Only one section of the log contains both the first pattern line and the second pattern line.

I've tried using awk with the following command using a portion of the first pattern, and escaping the "/" characters as needed:

awk '/\/var\/opt\/testsys\/rptprint\/AP710/{flag=1;next}/sending entire report to Job Mgr/{flag=0}flag' log 

But it returns a different section of the log that also happens to contain the path "/var/opt/testsys/rptprint/AP710". So I tried extending the first pattern by adding "Processing...", and now it doesn't return anything:

awk '/Processing\.\.\. AP710 \(\/var\/opt\/testsys\/rptprint\/AP710/{flag=1;next}/sending entire report to Job Mgr/{flag=0}flag' log

Can someone give some guidance about awk so I can get the lines between the 2 patterns? After spending a few hours on it I am going a little bonkers trying to figure it out. I think being new to awk is causing me to miss something obvious.

Cheers.

Chris
  • _get lines between the 2 patterns_ is solved here: https://stackoverflow.com/questions/38972736/how-to-print-lines-between-two-patterns-inclusive-or-exclusive-in-sed-awk-or If you have a problem with the data, post a better sample, as that sample won't show the problem. – James Brown May 10 '19 at 19:34

1 Answer

Whenever you find yourself escaping characters in a regexp to make them literal, really consider whether or not you should be using a regexp or if instead you should be doing a string comparison. In fact, always start out with a string comparison and switch to regexp if you need to.
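As a small illustration of why a literal test can be easier to get right, here is a sketch that assumes the sample in the question preserves the log's real spacing. In that sample there are two spaces between "AP710" and "(", while the second regexp in the question has only one, which may be why it matched nothing. The variable names `line`, `one_space` and `literal` are made up for this demo; `index()` is the standard awk substring function.

```shell
# The "Processing..." line exactly as it appears in the sample (two spaces before the parenthesis).
line='Processing... AP710  (/var/opt/testsys/rptprint/AP710)'

# Regexp from the question: only ONE space between AP710 and the parenthesis.
one_space=$(printf '%s\n' "$line" |
    awk '/Processing\.\.\. AP710 \(\/var\/opt\/testsys\/rptprint\/AP710/ { n++ } END { print n+0 }')

# Literal substring test with index(): nothing to escape, both spaces kept.
literal=$(printf '%s\n' "$line" |
    awk 'index($0, "Processing... AP710  (/var/opt/testsys/rptprint/AP710") { n++ } END { print n+0 }')

echo "regexp with one space matched $one_space line(s)"
echo "index() with two spaces matched $literal line(s)"
```

With a string test the pattern can be pasted straight from the log, so a spacing mismatch like this is less likely to creep in.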

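$ awk '
    # End line: turn printing off before the print test so it is not printed.
    $0=="sending entire report to Job Mgr (spool) for user" { inSection=0 }
    # Print the current line while inside the section.
    inSection;
    # Start line: turn printing on after the print test so it is not printed.
    $0=="Processing... AP710  (/var/opt/testsys/rptprint/AP710)" { inSection=1 }
' file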
sidjosajdois
sokds3488sds
doskdoskdoskdo
sodk229929
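
The order of the three statements is what keeps both delimiter lines out of the output. The sketch below shows how moving the print test changes which delimiters are printed. It uses made-up START/END markers and a throwaway temp file rather than the real log, and regexp matches only to keep the lines short.

```shell
# Hypothetical six-line input: one START ... END block with a line on either side.
sample=$(mktemp)
printf '%s\n' a START x y END b > "$sample"

# Print test between the two assignments: both delimiters are printed.
both=$(awk '/START/{f=1}; f; /END/{f=0}' "$sample" | paste -s -d ' ' -)
# Print test last: only the start delimiter is printed.
start_only=$(awk '/START/{f=1}; /END/{f=0}; f' "$sample" | paste -s -d ' ' -)
# Print test first: only the end delimiter is printed.
end_only=$(awk 'f; /START/{f=1}; /END/{f=0}' "$sample" | paste -s -d ' ' -)
# End test, then print test, then start test: neither delimiter is printed.
neither=$(awk '/END/{f=0}; f; /START/{f=1}' "$sample" | paste -s -d ' ' -)

echo "both:       $both"
echo "start only: $start_only"
echo "end only:   $end_only"
echo "neither:    $neither"
rm -f "$sample"
```

The last variant has the same shape as the script above: the end test runs first, then the print test, then the start test.
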
Ed Morton
  • @David no, awk supports `/start/,/stop/` too but it's an inferior solution to the problem of doing something with a block of text between delimiters. sed is stuck with it because sed doesn't have variables so you can't set a flag variable to use the better approach but awk does have variables and so can do it better. In this case, for example you don't have to test for the boundary conditions again to exclude them from the output. Try that with a `/start/,/stop/` approach. – Ed Morton May 10 '19 at 19:56
  • Thanks Ed, I was thrown for a loop until I looked at the order of your expression and the order of the lines. I was really scratching my head wondering "Now what the heck did he just do...". You're right, with `/start/,/stop/` you will either need another pattern `!{stop}` or a `{d}` with it. – David C. Rankin May 10 '19 at 19:58
  • Yeah you use the order to control which, if any, of the delimiters get printed. `/s/,/e/` is functionally equivalent to `/s/{f=1} f; /e/{f=0}` and prints both delimiting lines but with the flag version you can trivially just shuffle the parts to print just the start delimiter `/s/{f=1} /e/{f=0} f;` or just the end delimiter `f; /s/{f=1} /e/{f=0}` or neither of them `/e/{f=0} f; /s/{f=1}` (i.e. the case above). With the range expression version you have to start embedding additional tests for the things you already tested in the range expression or give up and completely rewrite it. – Ed Morton May 10 '19 at 20:07
  • There are edge cases to consider too where the delimiters appear on the same line as each other - also trivially handled with a flag solution and horrible with a range expression. – Ed Morton May 10 '19 at 20:10
  • After I snapped to what it was, how it worked was apparent, but not having used `awk` for block printing, it was a really good approach that I wouldn't have just thought of out of the blue. – David C. Rankin May 10 '19 at 20:10
  • Thanks Ed, this seems very close to what I need. I tried what you suggested and now I get lines starting from the “Processing... AP710 (/var/opt/testsys/rptprint/AP710)” pattern and lines after that, but I am getting more lines than desired, as the second pattern “sending entire report to Job Mgr (spool) for user” occurs throughout the log, almost all the way to the end of the file. Can the command be modified to capture lines only from the first occurrence of “Processing…” to the first occurrence of “sending entire report…”? Thanks again! – Chris May 13 '19 at 16:12
  • Of course, it's only software so anything's possible. All we have to go on though is the example you provide in your question so if the same input/output there doesn't adequately represent your real input/output and requirements then update it to do so. Once you've fixed that let me know and I'll take a look. FWIW I don't understand how what you're describing in your comments can happen since in my script the `"sending.."` string turns printing OFF and it'd take another "Processing" line to turn printing on again. AFAIK my script should do what you want as-is. – Ed Morton May 13 '19 at 16:16
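
If the real log happens to contain more than one "Processing... AP710" line, one possible way to keep only the first section is to `exit` as soon as the end line is seen while inside the section. This is only a sketch under that assumption, which the thread does not confirm. The log contents below are invented for the demo, and `log` and `first_section` are hypothetical names.

```shell
# Invented log: a stray "sending" line, then two Processing sections.
log=$(mktemp)
cat > "$log" <<'EOF'
sending entire report to Job Mgr (spool) for user
Processing... AP710  (/var/opt/testsys/rptprint/AP710)
sidjosajdois
sokds3488sds
sending entire report to Job Mgr (spool) for user
other line
Processing... AP710  (/var/opt/testsys/rptprint/AP710)
second section
sending entire report to Job Mgr (spool) for user
EOF

first_section=$(awk '
    # Stop reading at the first end line seen after the section has started.
    inSection && $0=="sending entire report to Job Mgr (spool) for user" { exit }
    # Print lines while inside the section.
    inSection
    # The start line itself is not printed because this test comes last.
    $0=="Processing... AP710  (/var/opt/testsys/rptprint/AP710)" { inSection=1 }
' "$log")

printf '%s\n' "$first_section"
rm -f "$log"
```

Because awk stops reading at the `exit`, any later "Processing" line can no longer turn printing back on.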