
I want to parse a complete HTML document, walking every element and its children, where the tags will not have any id attribute.

For example:

<html>
 <head>
  <script>
     function blah(){
        alert("hi");
     }
   </script>
  <style>
     body{
         font:10px;
     }
  </style>
 </head>
 <body>
   <h1> My Header </h1>
   <div class="container">
       <div class="colone">Hai22</div>
       <div class="coltwo">Hai44</div>
   </div>
 </body>
</html>

Now I would like to parse the whole HTML, visit its children one by one, and convert the result into a JSON string, like:

{
  "html":{
       "head":{
               "script":  
            .
            .
            .
            .
            .
            .
            .
}
pathfinder

1 Answer


This is not possible in the form you sketched, because the HTML (or XML-like) tree and the JavaScript/JSON object model have different constraints. Specifically, the keys of a JSON object must be unique, while an HTML element can have several children with the same tag name. This does not work as JSON:

"section": {
    "div": { ... },
    "div": { ... },
    "div": { ... }
}
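
To see what happens in practice, here is a small check in JavaScript. Strictly speaking, JSON.parse does not reject duplicate keys; it silently keeps the last one, which is just as bad for this purpose. The sample string and the "n" property are made up for illustration.

```javascript
// A JSON object with three "div" keys, mirroring the snippet above.
// The "n" values are only there to tell the three entries apart.
var text = '{"section": {"div": {"n": 1}, "div": {"n": 2}, "div": {"n": 3}}}';
var parsed = JSON.parse(text);

// Only one "div" key survives: the last duplicate overwrites the earlier ones.
console.log(Object.keys(parsed.section).length); // 1
console.log(parsed.section.div.n);               // 3
```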

You cannot have three properties of one object all called "div". In the end you have to store lists of objects, something like:

{ 
    "tagname": "section",
    "children": [
        { "tagname": "div",
          "children": ... }
     ...
     ]
}
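
A minimal sketch of producing that shape in JavaScript, assuming a browser DOM is available. The function name elementToTree and the extra "text" property are my own choices for illustration, not part of any library. It only relies on tagName, children and textContent, so it needs no id attributes.

```javascript
// Convert a DOM element into a plain { tagname, children } object.
// It only reads tagName, children and textContent, so it works on real
// elements in a browser and on plain mock objects elsewhere.
function elementToTree(el) {
  var node = {
    tagname: el.tagName.toLowerCase(),
    // Recurse into element children, preserving their order.
    children: Array.from(el.children || []).map(elementToTree)
  };
  // Assumption: for leaf elements, keep the text so "Hai22" is not lost.
  if (node.children.length === 0 && el.textContent) {
    node.text = el.textContent.trim();
  }
  return node;
}

// In a browser, serialize the whole page starting at <html>.
if (typeof document !== "undefined") {
  console.log(JSON.stringify(elementToTree(document.documentElement), null, 2));
}
```

Text that sits next to child elements inside the same parent is dropped by this sketch; walking childNodes instead of children would be needed to keep it.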

Once you get to that point the conversion is pretty much pointless. Use a standard DOM parsing library in your favourite programming language.

Tom Rees
  • Thanks for the reply @Tom. I will append the div id to the key in the JSON. I need to do the conversion in JavaScript/jQuery; is that possible? – pathfinder Nov 23 '14 at 09:34