I have been trying to write a regular expression to validate that a file follows a specific format. The file should have a version(); line followed by one or more element(); blocks.
Here is an example of a valid file:
version(1.0);
element
(
);
element
(
);
element
(
);
As a test I created the following Perl example:
use strict;
use warnings;
my $text = <<'END_TEXT';
version(1.0);
element
(
);
garbage <--- THIS SHOULD NOT MATCH!
element
(
);
element
(
);
END_TEXT
my $rx_defs = qr{(?(DEFINE)
    (?<valid_text>
        \A\s*(?&version)\s*
        (?: (?&element) \s* )+
        \s*\Z
    )
    (?<version>
        version\(.+?\);
    )
    (?<element>
        element\s*
        (?&element_body);
    )
    (?<element_body>
        \( (?: [^()]++ | (?&element_body) )* \)
    )
)}xms;
if ($text =~ m/(?&valid_text)$rx_defs/) {
    print "match";
}
As you can see, there is a line of "garbage" in the text that should make it not valid, but for some reason Perl still seems to think that this text is valid! When I run this code it produces the output:
match
I have spent hours trying to track down what is wrong with my regular expression, but I just don't see it. I even tested the exact regex using an online regular expression tester and according to the test my regular expression should be working fine! (Try removing the line of "garbage" if you want to see that it does match correctly when the format is valid.)
This has had me thoroughly stumped all day and is making me wonder whether there is a bug in the Perl regular expression engine itself. Can somebody please tell me why this is matching when it shouldn't?
I am using Perl v5.20.1.