1

I have HTML content that I need to strip out. Some of this content sits between comments as in:

<div>Some content</div>
<!-- Begin: Modal View -->
    <div class="foo">
        <div> Modal View</div>
        <div>...</div>
    </div>
<!-- End: Modal view -->
<div>Some other content</div>

I'd like to select the content starting from <!-- Begin: Modal View --> and ending with <!-- End: Modal view --> with a regular expression, that is, including the comments themselves. Once removed, the output would be:

<div>Some content</div>
<div>Some other content</div>

All I need is the regex. No need to illustrate how to get the HTML. Assume that the HTML is a String object with a .replace method.

Thanks!

snazzyHawk
  • 1
    [Don't parse HTML with RegEx](http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454) – antyrat Feb 26 '15 at 19:58
  • just be aware that things will go horribly wrong when there's something like: `var s = '';` in between. – Bart Kiers Feb 26 '15 at 20:06

3 Answers

1

If you know the exact text of the comments, then this should do:

s/<!-- Begin: Modal View -->.*?<!-- End: Modal view -->//sg

Notice the question mark: it makes the match non-greedy.

Otherwise the match would run from the very first opening comment to the very last closing comment, stripping everything in between.
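
For comparison, the same substitution can be sketched in JavaScript, which is what the question asks about. This is only an illustration, not part of the original answer: the `html` input is made up, `[\s\S]` stands in for `.` with the `s` flag so it also runs on engines without dotAll support, and the trailing `\n?` is an added assumption that also drops the line break after the closing comment.

```javascript
// Hypothetical input: two marked blocks, to show why laziness matters.
const html = [
  '<div>Some content</div>',
  '<!-- Begin: Modal View -->',
  '<div class="foo">first</div>',
  '<!-- End: Modal view -->',
  '<div>Some other content</div>',
  '<!-- Begin: Modal View -->',
  '<div class="foo">second</div>',
  '<!-- End: Modal view -->',
  '<div>Last content</div>',
].join('\n');

// Non-greedy (*?): each match stops at the nearest closing comment.
const lazy = /<!-- Begin: Modal View -->[\s\S]*?<!-- End: Modal view -->\n?/g;
// Greedy (*): one match runs from the first opening comment to the last closing one.
const greedy = /<!-- Begin: Modal View -->[\s\S]*<!-- End: Modal view -->\n?/g;

const strippedLazy = html.replace(lazy, '');
const strippedGreedy = html.replace(greedy, '');

console.log(strippedLazy);   // keeps "Some other content"
console.log(strippedGreedy); // loses "Some other content"
```

With the greedy version the unmarked div between the two blocks disappears as well, which is the failure described above.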

Random42
-1

Use the following regular expression:

/<!--\s*begin:\s*(.+?)\s*-->[\s\S]*?<!--\s*end:\s*\1\s*-->/gi

This will select everything from (and including) <!-- begin: label --> to (and including) <!-- end: label -->. The quantifiers are lazy, so each block ends at the nearest end comment carrying the same label.

To remove those occurrences, simply do:

var pattern = /<!--\s*begin:\s*(.+?)\s*-->[\s\S]*?<!--\s*end:\s*\1\s*-->/gi;
var input = ".....";
var stripped = input.replace(pattern, "");
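
To illustrate how the backreference `\1` pairs each opening label with its own closing label, here is a small sketch. The `page` string and its labels are made up for the example, and the pattern uses lazy quantifiers (`+?`, `*?`) on the assumption that several blocks in one document should be removed separately.

```javascript
// \1 must repeat the label captured after "begin:"; the i flag makes
// "Modal View" and "Modal view" count as the same label.
const labelled = /<!--\s*begin:\s*(.+?)\s*-->[\s\S]*?<!--\s*end:\s*\1\s*-->/gi;

// Hypothetical input with two differently labelled blocks.
const page =
  '<p>a</p>' +
  '<!-- Begin: Modal View --><div>modal</div><!-- End: Modal view -->' +
  '<p>b</p>' +
  '<!-- Begin: Sidebar --><div>side</div><!-- End: Sidebar -->' +
  '<p>c</p>';

const cleaned = page.replace(labelled, '');
console.log(cleaned); // <p>a</p><p>b</p><p>c</p>
```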
Mårten Wikström
-1

You could parse the HTML to find the comments. For that, see: http://www.bennadel.com/blog/2607-finding-html-comment-nodes-in-the-dom-using-treewalker.htm
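
A minimal sketch of that DOM-based idea, under stated assumptions: both marker comments are direct children of the same parent, and the function name `removeMarkedRange` and its parameters are hypothetical. It relies only on `childNodes`, `nodeType`, `nodeValue` and `removeChild`, so it scans one level of the tree rather than using the `TreeWalker` from the linked post.

```javascript
const COMMENT_NODE = 8; // numeric value of Node.COMMENT_NODE in the DOM

// Removes the begin comment, the end comment, and every sibling in between.
// `parent` is assumed to directly contain both comments.
function removeMarkedRange(parent, beginText, endText) {
  const isComment = (node, text) =>
    node.nodeType === COMMENT_NODE &&
    node.nodeValue.trim().toLowerCase() === text.toLowerCase();

  let removing = false;
  // Copy the live child list so removals do not disturb the iteration.
  for (const node of Array.from(parent.childNodes)) {
    if (!removing && isComment(node, beginText)) removing = true;
    if (removing) {
      parent.removeChild(node);
      if (isComment(node, endText)) removing = false;
    }
  }
}

// In a browser (the element id "my_html" is an assumption):
// removeMarkedRange(document.querySelector('#my_html'),
//                   'Begin: Modal View', 'End: Modal view');
```

Because it compares comment nodes rather than text, markup inside scripts or attribute values that merely looks like a marker comment is left alone.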

Otherwise, if you are not looking for a perfectly reliable and maintainable solution:

var my_html = document.querySelector("#my_html");

// The lazy [\s\S]*? stops at the nearest matching End comment.
my_html.innerHTML = my_html.innerHTML.replace(/<!--\s*Begin:(.*?)\s*-->[\s\S]*?<!--\s*End:\1\s*-->/gi, "");
<div id="my_html"> 
  
  <div>Some content</div>
<!-- Begin: Modal View -->
    <div class="foo">
        <div> Modal View</div>
        <div>...</div>
    </div>
<!-- End: Modal view -->
<div>Some other content</div>

</div>
Gaël Barbin