Input file

aaa
Any--END--Pattern
bbb
ANY--BEGIN--PATTERN
ccc                   # do not print
ANY--BEGIN--PATTERN   # print 1
ddd                   # print 2
Any--END--Pattern     # print 3
eee
fff
ANY--BEGIN--PATTERN   # print 4
ggg                   # print 5
Any--END--Pattern     # print 6
hhh                   # print 7
Any--END--Pattern     # print 8
iii                   # do not print
ANY--BEGIN--PATTERN
jjj

Wanted output

ANY--BEGIN--PATTERN   # print 1
ddd                   # print 2
Any--END--Pattern     # print 3
ANY--BEGIN--PATTERN   # print 4
ggg                   # print 5
Any--END--Pattern     # print 6
hhh                   # print 7
Any--END--Pattern     # print 8

Notes

  • Start printing from the latest ANY--BEGIN--PATTERN that precedes the current Any--END--Pattern.
  • Keep printing up to the last Any--END--Pattern as long as no new ANY--BEGIN--PATTERN is met.

Many similar questions exist, but none answers this issue

The answers I tested from these questions either print the line ccc and/or the line iii, or omit the lines carrying the BEGIN and END patterns. My own attempts have the same defects.
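
Checking each candidate by eye is error-prone, so a small helper can compare its output with the wanted output. This is only a sketch: the function name check_candidate and the file names input.txt and wanted.txt are assumptions for illustration, with the two files holding the input and the wanted output shown above.

```shell
# Hypothetical helper: feed the sample input to a candidate command on stdin
# and diff the result against the wanted output.
# Returns 0 when both match; otherwise diff prints the differences.
check_candidate() {
    input_file=$1
    wanted_file=$2
    shift 2
    # "$@" is the candidate command with its arguments.
    "$@" < "$input_file" | diff - "$wanted_file"
}

# Example call (assumed file names):
#   check_candidate input.txt wanted.txt sed -n '/BEGIN/,/END/p'
```

A non-empty diff shows at once whether a candidate leaks ccc or iii, or drops the pattern lines.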

We could write a ten-line script, but I am sure an elegant one-line command solves this issue; I just cannot find it. Therefore I think this could be a good SO question ;-)

I wonder which tricks sed, awk, perl or any other tool readily available on Unix-like systems offer here. Please provide a tiny command line using one of them, or any other tool you think fits.


EDIT:

Just to underline the pretty simple command line from Sundeep's comment that simplifies the problem by reversing the input file:

tac input.txt | sed -n '/END/,/BEGIN/p' | tac

But this command line also prints the leading lines up to the first Any--END--Pattern
(this case may not arise for other users looking at a similar issue):

aaa
Any--END--Pattern
ANY--BEGIN--PATTERN   # print 1
ddd                   # print 2
Any--END--Pattern     # print 3
ANY--BEGIN--PATTERN   # print 4
ggg                   # print 5
Any--END--Pattern     # print 6
hhh                   # print 7
Any--END--Pattern     # print 8

(This answer is used within these C++ coding rules.)
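
The leading lines can be dropped with one more filter after the second tac, which keeps only what follows the first ANY--BEGIN--PATTERN. This is a minimal sketch, assuming tac (GNU coreutils) is installed; the wrapper name last_blocks is only illustrative.

```shell
# Illustrative wrapper (the name last_blocks is hypothetical).
# Steps: reverse the file, print END..BEGIN ranges, restore the order,
# then drop everything before the first BEGIN line of the result.
last_blocks() {
    tac "$1" |
        sed -n '/Any--END--Pattern/,/ANY--BEGIN--PATTERN/p' |
        tac |
        sed -n '/ANY--BEGIN--PATTERN/,$p'
}

# Example call (assumed file name):
#   last_blocks input.txt
```

The last sed works because the only range without a BEGIN is the one left open at the end of the reversed input, which lands at the top once the order is restored.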

oHo
  • OK, you've read all those answers...have you made an attempt of your own, based on what you've read? If so, what problems are you facing? – Tom Fenech Sep 26 '16 at 12:49
  • You phrased this question like a code golf request. Judging from your experience, you should know better. ;) – simbabque Sep 26 '16 at 12:49
  • What do you want then, if you already have those answers? – 123 Sep 26 '16 at 12:49
  • @simbabque I can write a script to handle this, but I think a one-line command can do it and I cannot find one :-/ I think this could be a good question for SO, don't you? – oHo Sep 26 '16 at 12:53
  • @123 I posted my question too quickly. I have read many similar questions, but I cannot find an answer to my issue :-/ – oHo Sep 26 '16 at 12:56
  • @olibre I can't work out how the other questions don't answer this? – 123 Sep 26 '16 at 12:59
  • @123 Most of the other answers and my own attempts print the line `ccc`. I am checking your last command of your answer ;-) – oHo Sep 26 '16 at 13:02
  • @fedorqui: The output is not exactly the same. It doesn't contain the borderlines, and it prints `ccc`, too. – choroba Sep 26 '16 at 13:02
  • @choroba yep, I noticed later. The problem here is that the OP wants to print when the PAT2 occurs for the last time in a given block, so it keeps adding stuff in the buffer. Weird case. I got close adapting [this answer](http://stackoverflow.com/a/38972737/1983854) with `awk 'flag{ if (/Any--END--Pattern/){buf=buf $0 ORS; printf "%s", buf; flag=0; buf=""} else buf = buf $0 ORS}; /ANY--BEGIN--PATTERN/{flag=1; buf=$0 ORS}' file` – fedorqui Sep 26 '16 at 13:05
  • @TomFenech Thank you for your feedback about my original question, I have explained better my issue and why the other similar questions do not apply. (I will delete this comment and other of my comments later.) Cheers – oHo Sep 26 '16 at 13:17
  • becomes simpler by reversing input file, `tac ip.txt | sed -n '/Any--END--Pattern/,/ANY--BEGIN--PATTERN/p' | tac` but with unnecessary first two lines of input file... – Sundeep Sep 26 '16 at 13:21
  • @Sundeep Lovely tiny and understandable answer ;-) And I do not care about the little drawback about the extra `Any--END--Pattern`. Congratulations :-) – oHo Sep 26 '16 at 13:35
  • Hi @Sundeep I am using your trick in my document: https://github.com/olibre/CppCoding/blob/gh-pages/cpp/rules.md#fiqc--double-quotes--and-angle-brackets- I am requesting to reopen this question in order to let you provide your pretty simple answer. Thank you ;-) – oHo Sep 27 '16 at 09:26

3 Answers

Perl to the rescue!

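#!/usr/bin/perl
use warnings;
use strict;

my $last_end;   # index in @buffer of the most recent END line, if any
my @buffer;     # lines collected since the latest BEGIN
while (<>) {
    if (/BEGIN/) {
        # A new block starts: flush the previous one up to its last END.
        print @buffer[ 0 .. $last_end ] if defined $last_end;

        @buffer = $_;
        undef $last_end;
        next;
    }
    $last_end = @buffer if @buffer && /END/;
    push @buffer, $_ if @buffer;
}
# Flush the final block when no further BEGIN follows its last END.
print @buffer[ 0 .. $last_end ] if defined $last_end;

@buffer accumulates the lines from BEGIN, and $last_end points to, well, the last END in the buffer, so you can throw away accumulated lines that don't end in an END. The print after the loop covers input whose last block is not followed by another BEGIN; without it that block would never be printed.

As a one-liner (but why?):

perl -ne 'defined $l && print(@B[0..$l]), (@B, $l) = $_, next if /BEGIN/; $l=@B if @B && /END/; push @B, $_ if @B; END { print @B[0..$l] if defined $l }' file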
choroba

This should work with sed

sed '$b1;/BEGIN/{:1;x;s/\(BEGIN.*END[^\n]*\).*/\1/;t;x;h};H;d' file
123

awk to the rescue!

$ awk '/BEGIN/{c=0; b=1} 
              {a[c++]=$0} 
      b&&/END/{for(i=0;i<c;i++) print a[i]; delete a; c=0}' file

ANY--BEGIN--PATTERN   # print 1
ddd                   # print 2
Any--END--Pattern     # print 3
ANY--BEGIN--PATTERN   # print 4
ggg                   # print 5
Any--END--Pattern     # print 6
hhh                   # print 7
Any--END--Pattern     # print 8
karakfa
  • Much nicer than the `perl`-based answer and more understandable than the `sed`-based answer. Thanks ;-) – oHo Sep 27 '16 at 08:43