
I am writing a Perl program to extract the lines that fall between two patterns I am matching. For example, the text file below has 6 lines. I am matching load balancer and end, and I want to get the 4 lines in between.

**load balancer** 
new 
old
good
bad
**end**

My question is: how do I extract the lines between load balancer and end into an array? Any help is greatly appreciated.

Axeman

4 Answers


You can use the flip-flop operator (..) to tell you when you are between the markers. The range it reports includes the marker lines themselves, so you need to exclude them when collecting the data.

Note that if the input contains several such blocks, this will merge all of them into one array. In that case you need to save the contents of @array and reset it at the end of each block.

use strict;
use warnings;

my @array;
while (<DATA>) {
    if (/^load balancer$/ .. /^end$/) {
        push @array, $_ unless /^(load balancer|end)$/;
    }
}

print @array;

__DATA__
load balancer
new 
old
good
bad
end
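
For input with several blocks, one way to keep them apart is to collect each block into its own array reference. The following is only a minimal sketch under stated assumptions: extract_blocks is a hypothetical helper name, the markers are the plain lines load balancer and end, and the sample text is made up for illustration. It relies on the flip-flop's return value (1 on the start marker, a number ending in E0 on the end marker) to decide when to reset and when to store.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a list of array refs, one per
# "load balancer" .. "end" block found on the given filehandle.
sub extract_blocks {
    my ($fh) = @_;
    my (@records, @current);

    while (my $line = <$fh>) {
        # In scalar context .. is the flip-flop: false outside a block,
        # a sequence number inside it ("1" first, "NE0" on the last line).
        my $range = ($line =~ /^load balancer$/) .. ($line =~ /^end$/);
        next unless $range;

        if ($range eq '1') {             # start marker: begin a fresh block
            @current = ();
        }
        elsif ($range =~ /E0$/) {        # end marker: store a copy of the block
            push @records, [@current];
        }
        else {                           # a line between the markers
            chomp $line;
            push @current, $line;
        }
    }
    return @records;
}

# Illustrative input with two blocks and some unrelated lines.
my $text = <<'END_TEXT';
load balancer
new
old
end
noise
load balancer
good
bad
end
END_TEXT

open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
my @records = extract_blocks($fh);
close $fh;

print join(',', @$_), "\n" for @records;
# prints:
# new,old
# good,bad
```

A block whose end marker never appears is silently dropped by this sketch; whether that is the right behaviour depends on your data.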
TLP
  • Sweet, didn't know you could use the flip-flop op like this. – snoofkin Dec 08 '11 at 19:55
  • @abhirampotluri You're welcome. If you feel this answered your question, click the check mark (to the left) to mark it as correct. – TLP Dec 08 '11 at 21:44
  • @abhirampotluri Is there a reason you chose not to accept my answer, and instead chose a lower rated answer that came in after mine, which does not save the result into an array? I noted that you first accepted mine, then changed your mind. – TLP Dec 10 '11 at 01:22
  • @Miller You can't just rewrite my entire answer to suit a duplicate pointing to this question. If it's not the same answer, then it's not the same question. And I don't think that was really simplifying logic, rather the opposite. Maybe you meant making the code more generic. – TLP Aug 13 '14 at 23:55

You can use the flip-flop operator.

You can also use the return value of the flip-flop to filter out the boundary lines. While the range is true, the return value is a sequence number (starting with 1), and the last number in the range has the string E0 appended to it. The start marker therefore yields exactly 1, and the end marker yields a value containing E.

use strict;
use warnings;

# Define the marker regexes separately, because they're ugly and it's easier
# to read them outside the logic of the loop.
my $start_marker = qr{^ \s* \*\*load \s balancer\*\* \s* $}x;
my $end_marker   = qr{^ \s* \*\*end\*\* \s* $}x;

while( <DATA> ) {
    # False until the first regex is true.
    # Then it's true until the second regex is true.
    next unless my $range = /$start_marker/ .. /$end_marker/;

    # Flip-flop likes to work with $_, but it's bad form to
    # continue to use $_
    my $line = $_;

    print $line if $range !~ /^1$|E/;
}

__END__
foo
bar
**load balancer** 
new 
old
good
bad
**end**
baz
biff

Outputs:

new 
old
good
bad
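
To make the sequence numbers visible, here is a small sketch that labels each line with the value the flip-flop returned for it. The helper name label_range and the start/stop markers are assumptions made up for this illustration, not part of the answer's code.

```perl
use strict;
use warnings;

# Hypothetical helper: prefix every line with the flip-flop's return value
# in brackets. Outside the range the operator returns the empty string.
sub label_range {
    my @lines = @_;
    my @labelled;
    for my $line (@lines) {
        my $range = ($line =~ /^start$/) .. ($line =~ /^stop$/);
        push @labelled, "[$range] $line";
    }
    return @labelled;
}

print "$_\n" for label_range(qw(before start a b stop after));
# prints:
# [] before
# [1] start
# [2] a
# [3] b
# [4E0] stop
# [] after
```

This is why a test such as $range !~ /^1$|E/ keeps only the lines strictly between the markers: it rejects the value 1 and any value carrying the E0 suffix.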
Miller
Schwern

If you prefer a command line variation:

perl -ne 'print if m{\*load balancer\*}..m{\*end\*} and !m{\*load|\*end}' file
JRFerguson

For files like this, I often change the record separator ($/, or $RS from the English module) so that a single read returns everything up to the end line:

use English qw<$RS>;
local $RS = "\nend\n";

my $record = <$open_handle>;

chomp removes the current record separator rather than just a newline, so chomping the record gets rid of the trailing end line.

chomp( $record );
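
The snippet above leaves the record as one string that still contains the start marker and anything before it. A minimal sketch of one way to finish the job and get an array follows. The helper name lines_between, its parameters, and the sample text are assumptions for illustration, and it uses $/ directly instead of $RS from English.

```perl
use strict;
use warnings;

# Hypothetical helper: read one record terminated by the end-marker line and
# return the lines that follow the start-marker line. Both markers are
# assumed to sit on lines of their own.
sub lines_between {
    my ($fh, $start, $end) = @_;

    # Use "\n<end marker>\n" as the record separator for this read only.
    local $/ = "\n$end\n";
    my $record = <$fh>;
    return unless defined $record;

    # chomp strips the current separator; it returns 0 if the record
    # did not end with it, i.e. the end marker was never found.
    return unless chomp $record;

    # Drop everything up to and including the start-marker line.
    $record =~ s/\A.*?^\Q$start\E(?:\n|\z)//ms or return;

    return split /\n/, $record;
}

# Illustrative input: one block surrounded by unrelated lines.
my $text = "foo\nload balancer\nnew\nold\ngood\nbad\nend\nbaz\n";
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
my @array = lines_between($fh, 'load balancer', 'end');
close $fh;

print "$_\n" for @array;
# prints:
# new
# old
# good
# bad
```

Calling the helper again on the same handle would continue reading from just after the end line, so it can be used in a loop to pull out successive blocks.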
Axeman