$string2 = '<tag id="123">123</tag>';
$string2 =~ s/123(?![^><]*>)/456/cg;
I need an explanation about negative lookbehind pattern (?![^><]*>)
in the above code.
This is a negative lookahead, not a lookbehind. It asserts that the following characters do not match the pattern inside the lookahead.
In this case, (?![^><]*>) means "only match 123 if it is not followed by >, optionally with other characters except > or < in between".
So this regex will match:
123 alone
123 followed by other characters except >
123 followed by < before any >
But it will NOT match 123 followed by >, even with other characters (anything except <) in between. So for example:
123a -> will match and replace
123< -> will match and replace
123b> -> will NOT match
The [^><]* part matches any number of characters except > or <.
The > part then requires a literal > immediately after that run. Because the lookahead is negative, finding that > makes the overall match fail; if no such > can be reached, the match succeeds.
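To check the cases above, here is a minimal sketch that runs the same substitution over the original string and the three examples. The helper name replace_outside_tags is hypothetical, made up for illustration. The /c modifier from the question is dropped here because it has no effect on s/// (Perl warns about it when warnings are enabled).

```perl
use strict;
use warnings;

# Hypothetical helper: replace 123 with 456 unless the negative
# lookahead finds a closing ">" ahead of it.
sub replace_outside_tags {
    my ($text) = @_;

    # (?![^><]*>) rejects a match when a ">" can be reached
    # without first crossing another "<" or ">".
    $text =~ s/123(?![^><]*>)/456/g;

    return $text;
}

# The string from the question plus the three examples above.
for my $input ('<tag id="123">123</tag>', '123a', '123<', '123b>') {
    printf "%-26s => %s\n", $input, replace_outside_tags($input);
}
```

For the string from the question, the 123 inside the attribute is left alone because "> follows it, while the 123 between the tags is followed by < and so becomes 456.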
The code you have is trying to replace the text inside a tag without interfering with the tag itself. There are better ways to do this, and I typically reach for Mojo::DOM:
use v5.10;
use Mojo::DOM;
my $dom = Mojo::DOM->new('<tag id="123">123</tag>');
$dom->at( 'tag' )->child_nodes->[0]->replace( '456' );
say $dom;
This way, you don't have to think about any of the complexity of HTML or XML when you want to modify it. See https://stackoverflow.com/a/4234491/2766176 for fun.