
I am reading an XML file and replacing certain words in it. I don't want to replace words that appear in XML element names or attributes. I'm using negative lookbehind and lookahead and I'm 90% there. However, if an attribute value contains the string I want to replace, it still gets replaced. I don't want that.

My regex:

$fi =~ s/(?<!<|\/)(?<!value=")\bis\b(?!=)/iss/g;

This regex skips element names, attribute names, and a word that comes directly after value=". But if an attribute looks like value="hello there" and I am replacing there with some other word, it still gets replaced. I want a regex that does not match the word there when it appears anywhere inside a value="..." attribute.
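One way the requirement above might be sketched without a lookbehind is to let the regex consume each whole tag first and put it back unchanged, so that only words in text content are ever replaced. This is a minimal sketch, not a definitive solution: the helper name replace_outside_tags and the sample string are hypothetical, and it assumes no attribute value contains a literal > and that the input has no comments or CDATA sections. For anything beyond a small, well-known set of XML, a real XML parser is the safer route.

```perl
use strict;
use warnings;

# Hypothetical helper: replace the whole word $from with $to, but only in
# text that sits outside <...> tags. Assumes no '>' inside attribute values.
sub replace_outside_tags {
    my ($xml, $from, $to) = @_;
    # Alternative 1 captures an entire tag (name and attributes) in $1 and
    # the replacement puts it back untouched. Alternative 2 matches the
    # target word in text content, where $1 is undefined.
    $xml =~ s{(<[^>]*>)|\b\Q$from\E\b}{ defined $1 ? $1 : $to }ge;
    return $xml;
}

my $fi = '<item value="hello there">there is text</item>';
print replace_outside_tags($fi, 'there', 'THERE'), "\n";
# prints: <item value="hello there">THERE is text</item>
```

Because a tag is always matched as a single unit, the word inside value="hello there" is never visited on its own, which is what the variable-length lookbehind was trying to achieve.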

  • 2
    https://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454 You'd be much better off using an XML parser, and then fiddling with the data – sniperd Sep 22 '22 at 15:06
  • Yes, it's a project requirement, and this text is getting replaced in Perl, so I want a negative lookbehind to check whether the text has value=" before it and, if so, skip it. – Madsinceborn Sep 22 '22 at 15:19
  • Could you provide an input example and its expected matches? – Ricardo Valente Sep 22 '22 at 15:24
  • 1
    Ex: if the XML is like There, it should replace only the standalone value There and not the text that is present in the attribute value. – Madsinceborn Sep 22 '22 at 15:34
  • 3
    You are setting yourself an impossibly hard task if you are both trying to parse XML and exclude quoted strings using regex. I would never even bother attempting such an excruciating task when the alternative of just using an XML parser is so much simpler. It is possible to use regex for a small, well-known set of XML, but you would risk breaking your data. – TLP Sep 22 '22 at 21:22

0 Answers