I am looking for information on how to convert XHTML into a very specific XML format. For example, I have the following XHTML sample:
<body>
<div id="divParent" class="header" style="width: 250px; height: 200px;">
<fieldset id="fldScope" style="left: 5px; width: 240px; top: 5px; height: 60px;">
<label style="left: 5px; top: 5px;">Reason:</label>
<select id="selReason">
<option value="">SELECT ONE:</option>
<option value="TRAINING">TRAINING</option>
<option value="OTHER">OTHER</option>
</select>
</fieldset>
<fieldset class="bottomSection">
<button id="btnClose" accessKey="o" class="webbutton" type="button">
<u>O</u>K</button>
</fieldset>
</div>
</body>
which I need to transform into something like this:
<control controlId="topLevelDiv" controlType="HtmlDiv" controlSearchProperties="id=divParent;class=header">
<childControls>
<control controlId="topLevelFieldset" controlType="HtmlFieldSet" controlSearchProperties="id=fldScope">
<childControls>
<control controlId="topLevelLabel" controlType="HtmlLabel" controlSearchProperties="InnerText=Reason:">
<childControls/>
</control>
<control controlId="topLevelComboBox" controlType="HtmlComboBox" controlSearchProperties="Id=selReason">
<childControls>
<control controlId="defaultOption" controlType="HtmlListItem" controlSearchProperties="InnerText=SELECT ONE:">
<childControls/>
</control>
<control controlId="option1" controlType="HtmlListItem" controlSearchProperties="InnerText=TRAINING">
<childControls/>
</control>
<control controlId="option2" controlType="HtmlListItem" controlSearchProperties="InnerText=OTHER">
<childControls/>
</control>
</childControls>
</control>
</childControls>
</control>
<control controlId="bottomFieldset" controlType="HtmlFieldSet" controlSearchProperties="class=bottomSection">
<childControls>
<control controlId="okButton" controlType="HtmlButton" controlSearchProperties="Id=btnClose; accessKey=o; type=button">
<childControls/>
</control>
</childControls>
</control>
</childControls>
</control>
I already have the mapping from each HTML element to its control type. But when I try to load the XHTML as an XDocument (in order to extract attributes and elements), I get a parsing error.
I thought of using regular expressions and basic string manipulation, but that might become too hard to manage, especially when trying to cover all edge cases.
I am not sure what the best way to approach this would be. Please help!
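To make the intended transformation concrete, here is a minimal sketch of the tree walk, written in Python with the standard xml.etree.ElementTree module rather than XDocument. Several things in it are assumptions and not the real mapping: the CONTROL_TYPES table, the SEARCH_ATTRS list of attributes treated as search properties, the InnerText fallback, and the generated controlId scheme (tag name plus a counter) are all hypothetical. It also only works on input that is well-formed XML; a strict XML parser rejects unclosed tags or undeclared entities such as &nbsp;, which may be what the real page contains.

```python
import itertools
import xml.etree.ElementTree as ET

# Hypothetical tag -> controlType table; the real mapping is not shown in the post.
CONTROL_TYPES = {
    "div": "HtmlDiv",
    "fieldset": "HtmlFieldSet",
    "label": "HtmlLabel",
    "select": "HtmlComboBox",
    "option": "HtmlListItem",
    "button": "HtmlButton",
}

# Assumption: only these attributes become search properties (style is ignored).
SEARCH_ATTRS = ("id", "class", "accessKey", "type")


def local_name(tag):
    """Strip an XML namespace prefix such as {http://www.w3.org/1999/xhtml}."""
    return tag.rsplit("}", 1)[-1]


def search_properties(elem):
    """Build the controlSearchProperties string for one element."""
    props = [f"{a}={elem.get(a)}" for a in SEARCH_ATTRS if elem.get(a) is not None]
    if not props:
        # Assumed fallback: identify the element by its visible text.
        text = "".join(elem.itertext()).strip()
        if text:
            props.append(f"InnerText={text}")
    return ";".join(props)


def convert(elem, ids):
    """Return the <control> elements produced by elem and its descendants."""
    control_type = CONTROL_TYPES.get(local_name(elem.tag))
    if control_type is None:
        # Unmapped wrapper such as <body> or <u>: pass its children through.
        return [c for child in elem for c in convert(child, ids)]
    control = ET.Element("control", {
        "controlId": f"{local_name(elem.tag)}{next(ids)}",  # hypothetical naming
        "controlType": control_type,
        "controlSearchProperties": search_properties(elem),
    })
    child_controls = ET.SubElement(control, "childControls")
    for child in elem:
        child_controls.extend(convert(child, ids))
    return [control]


SAMPLE = """<body>
  <div id="divParent" class="header" style="width: 250px;">
    <fieldset id="fldScope">
      <label>Reason:</label>
      <select id="selReason">
        <option value="">SELECT ONE:</option>
      </select>
    </fieldset>
    <fieldset class="bottomSection">
      <button id="btnClose" accessKey="o" type="button"><u>O</u>K</button>
    </fieldset>
  </div>
</body>"""

controls = convert(ET.fromstring(SAMPLE), itertools.count(1))
print(ET.tostring(controls[0], encoding="unicode"))
```

Returning a list from convert lets unmapped elements disappear without losing their mapped descendants, so the two fieldsets end up as siblings under the div, as they are in the XHTML.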
Thanks in advance.