-1

I want to replace the characters < and > in some text. I have the regex below:

(<span[^>]+class\s*=\s*("|')subValue\2[^>]*>)[^<]*(<\/span>)|(<br(\/*)>)

It targets <br/>, <br>, and <span class="subValue">......</span>. Within those tags I want to replace < and > with &lt; and &gt;.

When I wrapped it in an outer group, it did not select only the < and > that belong to the <span> or <br>. Instead, it selected every < and >.

(<|>)(?!(<span[^>]+class\s*=\s*("|')subValue\2[^>]*>)[^<]*(<\/span>)|(<br(\/*)>))

What is wrong with the regex?

I have created a sample here.

Code snippet sample.

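var str = '-<br><span class="subValue">Value Here<br/>';
// Note: lookbehinds ((?<=...)) raise "invalid regexp group" in JavaScript engines that lack lookbehind support.
// Inside a string literal, \s must be written as \\s to reach the RegExp constructor.
var regex = new RegExp('(?<=span|br)(<|>)|(<|>)(?=span|br)|(?<="subValue"|\'subValue\')>|<(?=\/)|(?<=br\/)[\\s]*>', 'gi');
//str = str.match(regex);
str = str.replace(regex, 'Testing');
// write into the element that actually exists in the markup below
$('#result').html(str);
<div id="result" style="border:1px solid red;"></div>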
Chin
  • Hey look.. This is my favorite post http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags – choz Dec 03 '15 at 06:33
  • @choz, if I could, I wouldn't do this either. I'm currently getting a whole string of HTML from an external service, and I've just been told that this HTML should be treated as plain text rather than as markup. So I have no choice but to use regex for this. – Chin Dec 03 '15 at 06:43
  • Actually, I don't really get your question. The sample you provided seems to match all `<` and `>`, yet you say in your question that it doesn't select them. – choz Dec 03 '15 at 06:49
  • What I mean is that it doesn't select only the `<` and `>` from the `<span>` and `<br>`; instead, it selects every `<` and `>` in the overall input.
    – Chin Dec 03 '15 at 06:50
  • Will this do? `(?<=span|br)(<|>)|(<|>)(?=span|br)|(?<="|')>|<(?=\/)` – choz Dec 03 '15 at 07:10
  • Yes, it does most of the work. Thank you for your fast response, and your favorite post. – Chin Dec 03 '15 at 07:22
  • Any reason for the down vote? Please leave some comments as well. – Chin Dec 03 '15 at 09:59

2 Answers

0

I will just add this to the answer.

(?<=span|br)(<|>)|(<|>)(?=span|br)|(?<="|')>|<(?=\/)

This will match:

  • Any < or > that immediately follows span or br
  • Any < or > that immediately precedes span or br
  • Any > that immediately follows ' or " (single or double quote)
  • Any < that immediately precedes /
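
To make the four alternatives above concrete, here is a minimal sketch that applies the pattern with a replacer callback. It assumes a JavaScript engine with lookbehind support (ES2018 or later); the names `tagBracket` and `escapeTagBrackets` are hypothetical and only used for illustration.

```javascript
// The pattern from this answer, written as a regex literal with the global flag.
// Assumption: the engine supports lookbehind assertions (ES2018+).
const tagBracket = /(?<=span|br)(<|>)|(<|>)(?=span|br)|(?<="|')>|<(?=\/)/g;

// Hypothetical helper: escape only the brackets the pattern selects.
function escapeTagBrackets(input) {
  // Each match is a single character, either "<" or ">".
  return input.replace(tagBracket, (ch) => (ch === '<' ? '&lt;' : '&gt;'));
}

console.log(escapeTagBrackets('-<br>'));  // -&lt;br&gt;
console.log(escapeTagBrackets('1 < 2'));  // 1 < 2 (bracket not next to a tag name)
console.log(escapeTagBrackets('<br/>'));  // &lt;br/> (the closing > is left alone)
```

The last line shows a gap: the `>` in `<br/>` follows `/`, so none of the four alternatives covers it. That is presumably why the snippet in the question adds a further alternative for `br/`.
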
choz
  • Oh look, even StackOverflow treats everything after `"` as a string. – choz Dec 03 '15 at 07:41
  • It seems the regex needs to be converted to work in JavaScript, as lookbehinds are not working in JavaScript. – Chin Dec 03 '15 at 08:28
  • That's a javascript regex, and I've tested it on the sample link you gave. In which case doesn't it work? Perhaps I can be of some help. – choz Dec 03 '15 at 08:32
  • I have added the sample into my post. I'm getting an `invalid regexp group` error, and I think it is from the `lookbehinds`. – Chin Dec 03 '15 at 08:39
  • My bad. I had chosen the PHP tab in regex101. Sorry. – Chin Dec 03 '15 at 09:57
  • @Chin Oh, I didn't realize that as well.. Yes, the lookbehinds aren't working in this answer. I will try to play with it again later and see if I can be of any help. You shouldn't mark this as an answer then. :) – choz Dec 03 '15 at 10:05
0

(<|>)(?!(<span… What is wrong with the regex?

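You did not just wrap it in an outer group; you put the pattern <|> in front, followed by a negative lookahead. The pattern inside that lookahead almost never matches at that position, because it would have to start with another < immediately after the bracket just matched. A negative lookahead succeeds whenever its inner pattern fails, so practically every < and > is selected.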
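
A small sketch of that behaviour, using the wrapped regex from the question as a literal; the helper name `countMatches` is hypothetical and only there for the demonstration.

```javascript
// The wrapped regex from the question: a bracket, then a negative lookahead.
const wrapped = /(<|>)(?!(<span[^>]+class\s*=\s*("|')subValue\2[^>]*>)[^<]*(<\/span>)|(<br(\/*)>))/g;

// Hypothetical helper: how many brackets does the wrapped regex select?
function countMatches(input) {
  return (input.match(wrapped) || []).length;
}

console.log(countMatches('x<br>y'));   // 2: both brackets of the tag are selected
console.log(countMatches('1 < 2'));    // 1: a bracket that is not part of any tag is selected too
console.log(countMatches('<br><br>')); // 3: only the > directly followed by "<br>" is skipped
```

The lookahead is checked at the position after the bracket, so it only rejects a bracket that is immediately followed by a complete targeted tag, which is not what the question intended.
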


Here I used your original regex unchanged and performed the desired replacements with a replacer function:

// sample input: only the targeted span and br tags should be escaped, the trailing <> should not
var text = '\
<span class="subValue"></span>\
<br>\
<br>\
<span class=\'subValue\'>dsa</span>\
<>';
var regex = /(<span[^>]+class\s*=\s*("|')subValue\2[^>]*>)[^<]*(<\/span>)|(<br(\/*)>)/g;
// escape the angle brackets inside each matched tag only
function replacer(match) { return match.replace(/</g, '&lt;').replace(/>/g, '&gt;'); }
text = text.replace(regex, replacer);
// display the result
document.body.appendChild(document.createTextNode(text));
Armali