Loop and process over blocks of lines between two patterns in awk?

Question

This is a continuation of this question:

I have a file

1
2
PAT1
3    - first block
4
PAT2
5
6
PAT1
7    - second block
PAT2
8
9
PAT1
10    - third block

and I use awk '/PAT1/{flag=1; next} /PAT2/{flag=0} flag' to extract the blocks of lines.

Extracting them works fine, but I want to iterate over these blocks one at a time and do some processing with each block (e.g. save it to a file, process it with other scripts, etc.).

How can I construct such a loop?

score 1 · Accepted Answer · answered Oct 07 '20 at 06:46

The problem is not very clear, but you could do something like this:

awk '/PAT1/ {
   flag = 1
   ++n
   s = ""
   next
}
/PAT2/ {
   flag = 0
   printf "Processing record # %d =>\n%s", n, s
}
flag {
   s = s $0 ORS
}' file

Processing record # 1 =>
3    - first block
4
Processing record # 2 =>
7    - second block
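
Note that the third block in the sample input is never closed by a PAT2 line, so the script above does not report it. If such a trailing block should also be processed, one possible variation is to flush the buffer in an END rule. This is only a sketch under that assumption; the `emit` helper and the `blocks.txt` file name are made up for illustration and are not part of the original answer.

```shell
# Recreate the sample input from the question.
cat > file <<'EOF'
1
2
PAT1
3    - first block
4
PAT2
5
6
PAT1
7    - second block
PAT2
8
9
PAT1
10    - third block
EOF

awk '
function emit() {                        # hypothetical helper: report the buffered block
    printf "Processing record # %d =>\n%s", n, s
}
/PAT1/ { flag = 1; ++n; s = ""; next }   # open a new block and reset the buffer
/PAT2/ { if (flag) emit(); flag = 0 }    # close the block and process it
flag   { s = s $0 ORS }                  # collect lines while inside a block
END    { if (flag) emit() }              # a block was still open at end of input
' file | tee blocks.txt
```

With the sample input this should report three records instead of two, the last one being the unterminated block.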

anubhava

This looks like it's on the right track. What if, instead of printing, I'd like to save each block (the `s` variable) in a separate file named like `record%d`? – Flo Oct 07 '20 at 07:00
Of course, you can do whatever you like. I used `printf` just for demo purposes because the question only said to process a record. Just replace the `printf` with `print s > ("record" n); close("record" n)` – anubhava Oct 07 '20 at 07:23
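
Putting the comment's suggestion together with the answer might look roughly like the sketch below. The `record<n>` naming follows the comment. Two details are assumptions of this sketch rather than part of the answer: it writes with `printf "%s"` instead of `print`, because `s` already ends with ORS and `print` would append one more newline, and it only writes a file when a PAT1 line was actually seen first.

```shell
# Recreate the sample input from the question.
cat > file <<'EOF'
1
2
PAT1
3    - first block
4
PAT2
5
6
PAT1
7    - second block
PAT2
8
9
PAT1
10    - third block
EOF

awk '
/PAT1/ { flag = 1; ++n; s = ""; next }   # open a new block and reset the buffer
/PAT2/ {
    if (flag) {
        out = "record" n                 # record1, record2, ...
        printf "%s", s > out             # s already ends with ORS
        close(out)                       # release the file before the next block
    }
    flag = 0
}
flag { s = s $0 ORS }                    # collect lines while inside a block
' file
```

Each closed block ends up in its own file, which can then be handed to other scripts, for example with a shell loop over `record*`. The unterminated third block is not written, matching the behavior of the original answer.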

score 0 · Answer 2 · answered Oct 07 '20 at 11:52

This might work for you (GNU sed):

sed -ne '/PAT1/!b;:a;N;/PAT2/!ba;e echo process:' -e 's/.*/echo "&"|wc/pe;p' file

Gather up the lines between PAT1 and PAT2 and process the collection.

In the example above, the literal process: is printed.

The command to print the result of the wc command for the collection is built and printed.

The result of the evaluation of the above command is printed.

N.B. The position of the p flag in the substitution command is critical. If the p flag comes before the e flag, the pattern space is printed before the evaluation; if the p flag comes after the e flag, the pattern space is printed after the evaluation.
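
For comparison, a rough awk counterpart to the sed approach pipes each collected block into an external command. This is a sketch, not part of either answer: `wc -l` stands in for whatever script should process a block on its standard input, and the `cmd` variable and the `block_counts.txt` file name are chosen here purely for illustration.

```shell
# Recreate the sample input from the question.
cat > file <<'EOF'
1
2
PAT1
3    - first block
4
PAT2
5
6
PAT1
7    - second block
PAT2
8
9
PAT1
10    - third block
EOF

# cmd is a placeholder: any program that reads a block on stdin would do.
awk -v cmd='wc -l' '
/PAT1/ { flag = 1; s = ""; next }        # open a new block and reset the buffer
/PAT2/ {
    if (flag) {
        printf "%s", s | cmd             # feed the block to the command on stdin
        close(cmd)                       # wait for it to finish before the next block
    }
    flag = 0
}
flag { s = s $0 ORS }                    # collect lines while inside a block
' file | tee block_counts.txt
```

Calling `close(cmd)` after every block matters here: without it, awk would keep a single pipe open and the command would see all blocks as one stream instead of running once per block.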
