I have a string variable:

Some text...<div class=example><pre><ul><li>Item</li></ul></pre><div class=showExample></div></div>Some text...

I want to replace all < and > characters inside the pre tag with the HTML entities &lt; and &gt;, so I wrote this script:

text = text.replace(new RegExp("(?=(<pre>.*))<(?=(.*</pre>))","ig"),"&lt;");
text = text.replace(new RegExp("(?=(<pre>.*))>(?=(.*</pre>))","ig"),"&gt;");

I always get this result:

<p>Some text...<div class=example>&lt;pre><ul><li>Item</li></ul></pre><div class=showExample></div></div>Some text...</p>

Why?

stepanVich
  • possible duplicate of [RegEx match open tags except XHTML self-contained tags](http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags) – Cole Tobin Aug 10 '13 at 19:16
  • @Cole"Cole9"Johnson I don't think that it's a possible duplicate. This is a valid question about the behaviour of regex. OP is not parsing the HTML with regex, which is different. It could have been on another text other than HTML. – Jerry Aug 10 '13 at 19:52
  • @Jerry he _is_ using regex to parse HTML. He's using regex to find the pre tags. Just use a DOM parser – Cole Tobin Aug 10 '13 at 19:54
  • @Cole"Cole9"Johnson The question isn't itself about parsing the html, but about _why_ the behaviour of the regex was such. There is a difference there. – Jerry Aug 10 '13 at 19:58
  • @Jerry yes. But he still shouldn't be doing it. – Cole Tobin Aug 10 '13 at 19:59
  • @Cole"Cole9"Johnson That's not a reason to disregard a learning opportunity. – Jerry Aug 10 '13 at 19:59

2 Answers

1

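It's because of your first lookahead: (?=(<pre>.*)). A lookahead is tested at the current position and looks forward, so it only succeeds when the cursor of the regex is right before <pre>. At that position the next character is the < of <pre> itself, so that is the only < that gets matched and replaced.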

You probably intended to have a lookbehind there, (?<= ... ), instead, but JavaScript didn't support them when this was written (lookbehind was only added to the language in ES2018).
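
To make the explanation above concrete, here is a small sketch that runs the question's first pattern against the question's sample string and records where it matches. The variable names (`pattern`, `matchedAt`) are only illustrative and not part of the original post.

```javascript
// Sample input taken from the question.
const text = 'Some text...<div class=example><pre><ul><li>Item</li></ul></pre>' +
  '<div class=showExample></div></div>Some text...';

// The question's pattern. The first group is a lookahead, so it is tested
// at the same position as the literal "<" that follows it.
const pattern = new RegExp("(?=(<pre>.*))<(?=(.*</pre>))", "ig");

// Collect the index of every "<" that the pattern actually matches.
const matchedAt = [];
let m;
while ((m = pattern.exec(text)) !== null) {
  matchedAt.push(m.index);
}

// Only one "<" is matched, and it is the one that opens "<pre>".
console.log(matchedAt.length);                        // 1
console.log(matchedAt[0] === text.indexOf('<pre>'));  // true
```

Since the only position where `<pre>` lies directly ahead is the start of `<pre>` itself, none of the `<` characters inside the block can satisfy the first lookahead.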

I'm not familiar with JS, but it might be easier to first extract the stuff within the <pre> tags:

match = text.match(/<pre>(.*?)<\/pre>/)[1];

Then replace all you need to replace in this little group:

match = match.replace(/</g, '&lt;').replace(/>/g, '&gt;');

Then replace it back into the original:

text = text.replace(/<pre>.*?<\/pre>/g, '<pre>'+match+'</pre>');

As said before, I'm not familiar with JS, but I guess you can run a loop to replace multiple texts within those <pre> tags.
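
One possible way to handle several <pre> blocks without an explicit loop is to pass a callback to `replace`, which is called once per match. This is only a sketch under some assumptions: the helper name `escapePreContents` is made up for illustration, the <pre> tags carry no attributes and are not nested, and an upper-case `<PRE>` would be rewritten as lower-case.

```javascript
// Hypothetical helper (not from the original answer): escapes "<" and ">"
// inside every <pre>...</pre> block of the input string.
function escapePreContents(text) {
  // [\s\S]*? matches lazily and also spans newlines; "g" visits every block.
  return text.replace(/<pre>([\s\S]*?)<\/pre>/gi, function (whole, inner) {
    const escaped = inner.replace(/</g, '&lt;').replace(/>/g, '&gt;');
    // Returning the string from a callback avoids "$" being treated
    // as a special replacement pattern.
    return '<pre>' + escaped + '</pre>';
  });
}

const input = 'a<pre><ul><li>Item</li></ul></pre>b<pre><b>x</b></pre>c';
console.log(escapePreContents(input));
// a<pre>&lt;ul&gt;&lt;li&gt;Item&lt;/li&gt;&lt;/ul&gt;</pre>b<pre>&lt;b&gt;x&lt;/b&gt;</pre>c
```

Each block is escaped with its own contents, so text outside the <pre> tags is left untouched.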

For your example, here's a fiddle.

Jerry
-1

Maybe you're better off using jQuery to HTML-encode/decode than using regex, since a regex would break on more complex markup.

There is an example here.

Nenad