10

I have the following HTML:

<!--
<option value="HVAC">HVAC</option>
<option value="Cooling">|---Cooling</option>
<option value="Heating">|---Heating</option>
-->
....

I fetch this file dynamically using jQuery's get method and store it in a string variable named load_types.

How can I strip the HTML comment tags and everything outside of them? I only want the inside HTML:

<option value="HVAC">HVAC</option>
<option value="Cooling">|---Cooling</option>
<option value="Heating">|---Heating</option>

I tried the solutions here, but none of them worked properly; I just get null as a match.

Thanks for the help!

hao_maike

2 Answers

17

Please never use regex to parse HTML. Let the browser parse the string and read the comment node (nodeType 8) instead:

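// Let the browser parse the fetched string inside a detached <div>.
var div = $("<div>").html(load_types),
    // Keep only comment nodes (nodeType 8) among the div's direct children.
    comment = div.contents().filter(function() {
        return this.nodeType === 8;
    }).get(0);

// nodeValue holds the text between <!-- and -->.
console.log(comment.nodeValue);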

DEMO: http://jsfiddle.net/HHtW7/
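
The filter above only inspects the direct children of the wrapper element, so a comment nested inside another element would be missed. A minimal sketch of a recursive variant follows. The helper name `collectComments` is hypothetical (it is not part of jQuery or the DOM), and it only assumes that each node exposes `nodeType`, `nodeValue`, and `childNodes`, as DOM nodes do.

```javascript
// Comment nodes have nodeType 8 (Node.COMMENT_NODE in the DOM).
var COMMENT_NODE = 8;

// Hypothetical helper: gathers the text of every comment node under `root`,
// at any depth, in document order.
function collectComments(root) {
    var found = [];
    // Fall back to an empty list for nodes without children.
    var children = root.childNodes || [];
    for (var i = 0; i < children.length; i++) {
        var node = children[i];
        if (node.nodeType === COMMENT_NODE) {
            found.push(node.nodeValue);
        } else {
            // Descend into elements; text nodes simply yield nothing.
            found = found.concat(collectComments(node));
        }
    }
    return found;
}
```

In a browser this could be called as `collectComments($("<div>").html(load_types).get(0))`, which returns an array with one string per comment found.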

VisioN
  • 5
    Native JavaScript solution: http://jsfiddle.net/4g3FT/ , Native JS solution assuming ES5 http://jsfiddle.net/TUR65/ – Benjamin Gruenbaum May 25 '13 at 20:19
  • @BenjaminGruenbaum Yeah, I was just writing it. Thanks! – VisioN May 25 '13 at 20:20
  • @BenjaminGruenbaum No, that's more than fine. jQuery was far enough from me `;)` – VisioN May 25 '13 at 22:35
  • @VisioN You're right about *don't parse HTML using regexes*. But OP was looking to get the text between `<!--` and `-->`. This is very *regular*. Once you have the text between those delimiters, you can parse it using DOM. In this case, regexes > DOM, because you don't need to iterate over the document just to extract text. – Matías Fidemraizer May 26 '13 at 08:39
  • 1
    @MatíasFidemraizer Do I really need to post a really complicated counterexample, or would something trivial like ` world"; --->` be enough to convince you that a regex is a bad tool for this sort of thing? – Benjamin Gruenbaum May 26 '13 at 19:41
  • @MatíasFidemraizer `">` – John Dvorak May 26 '13 at 19:44
  • 1
    @MatíasFidemraizer that thing may be regular, but the context in which to find it is certainly not – PeeHaa May 26 '13 at 19:44
  • 2
    You're going to the edge case. OP's case is regular, very very very very very regular. OP's case wasn't "I want to parse any HTML" but just the HTML shown in the sample code. Is this regular? It is very regular! I know that you wouldn't create a full HTML parser using regexes, but regexes could be enough for this case. – Matías Fidemraizer May 27 '13 at 06:16
  • 2
    @MatíasFidemraizer Regular expressions are an extremely poor choice for extracting a string out of a comment. If OP's string changes even slightly, the regular expression you suggested earlier can break. While matching a comment in a string like OP's surely is regular (just imagine the automaton), in practice it's an extremely poor idea, especially since this ability is already built into the browser and any slightly competent JS coder who knows DOM 101 can do so effortlessly. – Benjamin Gruenbaum May 27 '13 at 19:32
  • NB: These answers do not work recursively, only nodes remaining directly in the queried node will be found. – lmeurs Jan 17 '15 at 10:18
0

You can simply get the HTML of the parent tag that contains the comment and call .replace("<!--","").replace("-->", "") on it, which removes the comment tags. Then append the resulting markup to another parent, replace your current markup with it, or create a new parent for it.

This will allow you to use the jQuery selectors to retrieve the required data.

var comment = '<!-- <option value="HVAC">HVAC</option> <option value="Cooling">|---Cooling</option> <option value="Heating">|---Heating</option> --> ';

jQuery("#juni").append("<select>"+comment.replace("<!--", "").replace("-->", "") + "</select>")
<script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.3.1/jquery.min.js"></script>
<div id="juni"></div>
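
The chained `replace` calls remove the two markers but keep any text outside the comment (such as the trailing `....` in the question), and `replace` with a string pattern only affects the first occurrence. As a rough sketch, the text between the first pair of markers can be sliced out with `indexOf` instead. The helper name `extractFirstComment` is hypothetical, and this is plain string matching: it does not understand HTML, so a `-->` inside a quoted string would still confuse it.

```javascript
// Hypothetical helper: returns the text inside the first <!-- ... --> pair,
// or null when no complete comment is present.
function extractFirstComment(markup) {
    var open = "<!--";
    var close = "-->";
    var start = markup.indexOf(open);
    if (start === -1) return null;
    // Search for the closing marker only after the opening one.
    var end = markup.indexOf(close, start + open.length);
    if (end === -1) return null;
    // Drop everything outside the markers and trim surrounding whitespace.
    return markup.slice(start + open.length, end).trim();
}
```

The result could then be passed to the same `append` call, for example `jQuery("#juni").append("<select>" + extractFirstComment(comment) + "</select>")`.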
Junaid Anwar