2
$string2 = '<tag id="123">123</tag>';

$string2 =~ s/123(?![^><]*>)/456/cg;

I need an explanation about negative lookbehind pattern (?![^><]*>) in the above code.

toolic

2 Answers

6

This is a negative lookahead, not a lookbehind. It asserts that the following characters do not match the pattern inside the lookahead.

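In this case, (?![^><]*>) means "only match 123 if it is not followed by a run of zero or more characters other than > or < and then a >". Put differently, 123 matches only when the next angle bracket after it, if there is one, is not >.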

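So this regex will match:

  • 123 at the end of the string, or with no angle bracket anywhere after it
  • 123 followed by characters other than > or <, as long as that run is not closed by >
  • 123 whose next angle bracket is <, even if a > appears later (for example 123</tag>)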

But it will NOT match 123 when the next angle bracket after it is >, whether that > follows immediately or after other characters that are neither > nor <. So for example:

  • 123a -> will match and replace
  • 123< -> will match and replace
  • 123b> -> will NOT match

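The [^><]* part matches zero or more characters that are neither > nor <. The > part then requires a literal > immediately after that run. Because the whole group is a negative lookahead, 123 is rejected exactly when that combination is found, which is the case for the 123 inside <tag id="123">. The lookahead only inspects the text; it consumes no characters.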
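To check the cases above, here is a minimal sketch that runs the substitution over the question's string and the example inputs. The helper name replace_outside_tags and the extra input 123a<b> are made up for illustration. The /c flag from the question is dropped here because it has no effect on s/// (Perl warns about it under "use warnings").

```perl
use strict;
use warnings;

# Hypothetical helper: replace 123 with 456 unless the next
# angle bracket after it is '>', i.e. unless 123 sits inside a tag.
sub replace_outside_tags {
    my ($text) = @_;
    (my $copy = $text) =~ s/123(?![^><]*>)/456/g;
    return $copy;
}

for my $input ('<tag id="123">123</tag>', '123a', '123<', '123b>', '123a<b>') {
    printf "%-26s => %s\n", $input, replace_outside_tags($input);
}
```

Only 123b> and the 123 in the id attribute should come back unchanged, since those are the two places where the next angle bracket is >.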

brian d foy
SWARNAVA DUTTA
1

The code you have is trying to replace the text inside a tag without interfering with the tag itself. There are better ways to do this, and I typically reach for Mojo::DOM:

use v5.10;
use Mojo::DOM;

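# Parse the fragment into a DOM tree
my $dom = Mojo::DOM->new('<tag id="123">123</tag>');
# The first child node of <tag> is its text node; replace that node
$dom->at( 'tag' )->child_nodes->[0]->replace( '456' );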

say $dom;

This way, you don't have to think about any of the complexity of HTML or XML when you want to modify it. See https://stackoverflow.com/a/4234491/2766176 for fun.

brian d foy