I am trying to perform a global substitution in Perl on a string, on the basis of a certain pattern NOT matching before or after a certain match. Basically, I have an XML tag and want to keep it if a match occurs within ten characters before or after the tag, but remove the tag if not.

So, if I have a string that contains:

foo something<xml tag>bar</xml tag> something

no substitution should occur, but if the string is

something <xml tag>bar</xml tag> something

it would be replaced with:

something bar something

What I tried is:

$string =~ s/(?<!foo.{0,10})<xml tag>(bar)<\/xml tag> |<xml tag>(bar)<\/xml tag>(?!.{0,10}foo)/$1/g;

But I got this error:

Variable length lookbehind not implemented in regex

I'm not really sure how to do this. Help?

aa762
    Look behinds in regex have to be fixed length: http://stackoverflow.com/questions/3796436/whats-the-technical-reason-for-lookbehind-assertion-must-be-fixed-length-in-r – Martyn Oct 30 '13 at 10:36

2 Answers

From perlretut:

Lookahead "(?=regexp)" can match arbitrary regexps, but lookbehind "(?<=fixed-regexp)" only works for regexps of fixed width, i.e., a fixed number of characters long. Thus "(?<=(ab|bc))" is fine, but "(?<=(ab)*)" is not.
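
To see that rule in action, here is a small sketch that tries to compile both patterns from the quote. The helper name compiles is made up for illustration. The patterns are kept in strings so that they are compiled at run time, where a failure can be trapped with eval; the exact error text depends on the Perl version, so it is not printed here.

```perl
use strict;
use warnings;

# Hypothetical helper: return 1 if the pattern compiles, 0 otherwise.
# The pattern arrives as a string, so qr// compiles it at run time and
# a compile error dies inside eval instead of aborting the script.
sub compiles {
    my ($pattern) = @_;
    my $re = eval { qr/$pattern/ };
    return defined $re ? 1 : 0;
}

# The first lookbehind is fixed width (two characters either way);
# the second is unbounded because of the "*".
for my $pattern ('(?<=(ab|bc))x', '(?<=(ab)*)x') {
    printf "%-16s %s\n", $pattern, compiles($pattern) ? 'compiles' : 'rejected';
}
```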

So if the text before <xml tag>bar</xml tag> has a fixed length, you can use a lookbehind; otherwise, you could use more than one regex, for example.
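
As a rough sketch of that idea: match the element with one simple pattern, then inspect the text around it with substr and index instead of a lookbehind. The helper name strip_tags_far_from and its parameters are made up for illustration, and the window size (the distance plus the length of the word) is my assumption about what "within ten characters" should mean. Unlike "." in a regex, the window here can also span newlines.

```perl
use strict;
use warnings;

# Hypothetical helper: remove <xml tag>...</xml tag> (keeping its content)
# unless $word occurs within $dist characters before or after the element.
sub strip_tags_far_from {
    my ($text, $word, $dist) = @_;
    my $window = $dist + length $word;   # room for the word plus the gap
    my $orig   = $text;                  # untouched copy for context lookups

    # $1 is the whole element, $2 its content; @- and @+ give the offsets
    # of the current match, which are also valid in the copy.
    $text =~ s{(<xml tag>([^<]*)</xml tag>)}{
        my ($start, $end) = ($-[0], $+[0]);
        my $from   = $start > $window ? $start - $window : 0;
        my $before = substr($orig, $from, $start - $from);
        my $after  = substr($orig, $end, $window);
        (index($before, $word) >= 0 || index($after, $word) >= 0) ? $1 : $2;
    }ge;

    return $text;
}

print strip_tags_far_from("foo something<xml tag>bar</xml tag> something\n", 'foo', 10);
print strip_tags_far_from("something <xml tag>bar</xml tag> something\n", 'foo', 10);
```

With the two sample strings from the question, the first line should come back unchanged and the second should lose its tags.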

edem

One way using the e flag:

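while (<DATA>) {
    # $2 holds up to 13 characters before the tag ("foo" plus up to 10 more).
    # The lookahead makes the match fail, so the tag is kept, when "foo"
    # starts within 10 characters after the closing tag.
    s/((.{0,13})<xml\ tag>([^<]*)<\/xml\ tag>)(?!.{0,10}foo)/
    index($2,'foo') > -1 ? "$1" : "$2$3"/xe;
    print $_;
}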

__DATA__
foo something<xml tag>bar</xml tag> something
something <xml tag>bar</xml tag> something

Produces:

foo something<xml tag>bar</xml tag> something
something bar something
perreal