-3

I have the following HTML:

<aside id="side">
    <a class="active" href="#!"> some text <a>
    <a href="#!"> some text <a>
    <p> active </p>
</aside> 

I am looking for a regex that matches the string 'active' only when it is inside <aside id="side"></aside> and is the value of a class attribute. Plain text such as <p> active </p> should not match.

I tried:

<aside.*[\s\S\n]class="active".*</aside>

but I don't get any match.

rioV8
Ahmad Reza
  • [Parsing HTML with regex is a hard job](https://stackoverflow.com/a/4234491/372239) HTML and regex are not good friends. Use a parser, it is simpler, faster and much more maintainable. – Toto Jun 06 '20 at 08:38

2 Answers

-1

Try this:

/class="\w*\W*active\W*\w*"/

Vijay Atmin
-1

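The problem with your regex is that the . in .* does not match newlines, and [\s\S\n] matches only a single character, so the pattern cannot span the lines between <aside and class="active". In JavaScript you can either add the s (dotAll) flag, available since ES2018, so that . also matches newlines, or use [\s\S]* as a workaround that also works in older engines. [\s\S] matches any character, whitespace or non-whitespace, including newlines.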
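As a small illustration of how the dot treats newlines (the sample string is made up for this sketch), the following compares the plain dot, the [\s\S] workaround, and the s flag that ES2018 introduced:

```javascript
// Two words separated by a newline (made-up sample input).
var text = 'start\nend';

// A plain dot stops at the newline, so this does not match.
var dotOnly = /start.*end/.test(text);

// [\s\S] matches any character, newline included.
var anyChar = /start[\s\S]*end/.test(text);

// With the s (dotAll) flag, the dot also matches newlines (ES2018 and later).
var dotAll = new RegExp('start.*end', 's').test(text);

console.log(dotOnly, anyChar, dotAll); // false true true
```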

Here is a working code snippet that demonstrates the [\s\S]* workaround on the HTML from the question:

var html1 = 
  '<aside id="side">\n' +
  '    <a class="active" href="#!"> some text <a>\n' +
  '    <a href="#!"> some text <a>\n' +
  '    <p> active </p>\n' +
  '</aside>';
var html2 = 
  '<bside id="side">\n' +
  '    <a class="active" href="#!"> some text <a>\n' +
  '    <a href="#!"> some text <a>\n' +
  '    <p> active </p>\n' +
  '</bside>';
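// [\s\S]* can span newlines, unlike .*
var re = /<aside [\s\S]* class="active"[\s\S]*<\/aside>/;
console.log('test html1: ' + re.test(html1)); // true: <aside> contains class="active"
console.log('test html2: ' + re.test(html2)); // false: there is no <aside> tag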
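The test above only says whether the whole block matches. If you want the class values themselves, one possible approach is to do it in two steps: first isolate the aside body, then search only that body. This is a rough sketch, not a robust HTML parser. The function name findActiveClasses is hypothetical, and it assumes double-quoted attributes, id="side" as the first attribute, and no nested aside elements:

```javascript
// Hypothetical helper: returns the class values that contain the token
// "active", looking only inside <aside id="side"> ... </aside>.
function findActiveClasses(html) {
  // Step 1: capture the body of the aside block. The lazy [\s\S]*? stops at
  // the first closing tag instead of running on to the last one.
  var aside = /<aside\s+id="side"[^>]*>([\s\S]*?)<\/aside>/.exec(html);
  if (!aside) {
    return [];
  }
  // Step 2: inside that body, collect class="..." values in which "active"
  // is a whole, space-separated token (so "inactive" is not counted).
  var classRe = /\sclass="([^"]*)"/g;
  var found = [];
  var m;
  while ((m = classRe.exec(aside[1])) !== null) {
    if (m[1].split(/\s+/).indexOf('active') !== -1) {
      found.push(m[1]);
    }
  }
  return found;
}

var sample =
  '<aside id="side">\n' +
  '    <a class="active" href="#!"> some text </a>\n' +
  '    <p> active </p>\n' +
  '</aside>\n' +
  '<div class="active">outside</div>';
console.log(findActiveClasses(sample)); // [ 'active' ]
```

The text in <p> active </p> is ignored because it is not a class value, and the div after the closing tag is ignored because it is outside the captured body. For real-world markup, a DOM parser remains the safer choice, as the comment on the question points out.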
Peter Thoeny