I'm trying to parse an HTML file, and I have a regular expression that captures patterns inside all p tags. For some reason it only prints the first match it finds.
    my @newH2Array = ("Part I", "Part II", "Part III");
    my $linenumber = 0;
    while (my $line = <$parser>) {
        chomp $line;
        $linenumber++;
        if ($line =~ /^<p>/) {
            if ($line =~ /(Part [IVX]+)/gi) {
                if (grep { lc $_ eq lc $1 } @newH2Array) {
                    print "found a hit <" . $1 . "> that matches array element on line " . $linenumber . "\n";
                }
            }
        }
    }
When I run it against the test input below, it prints only Part I and not the other two. When I switched the if statements to a while loop, it didn't work either. Can anyone tell me what I'm doing wrong here?
    <p>Part I should be found. Part II should be found also. Part III should be found.</p>
The result should be:

    found a hit <Part I> that matches array element on line 1
    found a hit <Part II> that matches array element on line 1
    found a hit <Part III> that matches array element on line 1
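
For comparison, here is a minimal sketch of one way to get that output. It assumes only the inner match becomes a `while` and the outer `<p>` test stays an `if`-style check. In scalar context a `/g` match finds one match per evaluation and resumes from where the previous one stopped, so an `if` evaluates it once while a `while` steps through every match on the line. The helper name `find_hits` and the in-memory filehandle standing in for `$parser` are assumptions made for illustration, not part of the original script.

```perl
use strict;
use warnings;

my @newH2Array = ("Part I", "Part II", "Part III");

# Hypothetical helper: collects the hit messages for one filehandle.
sub find_hits {
    my ($fh) = @_;
    my @hits;
    my $linenumber = 0;
    while (my $line = <$fh>) {
        chomp $line;
        $linenumber++;
        next unless $line =~ /^<p>/;    # only look at lines starting with <p>
        # Each evaluation of the /g match in scalar context continues after
        # the previous match, so this loop visits every match on the line.
        while ($line =~ /(Part [IVX]+)/gi) {
            my $part = $1;    # copy $1 before another regex can overwrite it
            if (grep { lc $_ eq lc $part } @newH2Array) {
                push @hits, "found a hit <$part> that matches array element on line $linenumber";
            }
        }
    }
    return @hits;
}

# Assumption: an in-memory filehandle stands in for $parser.
my $html = "<p>Part I should be found. Part II should be found also. Part III should be found.</p>\n";
open my $parser, '<', \$html or die "Cannot open in-memory handle: $!";
my @hits = find_hits($parser);
close $parser;
print "$_\n" for @hits;
```

Note that turning the outer `if ($line =~ /^<p>/)` into a `while` as well would loop forever, because that match has no `/g` and succeeds on every evaluation.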