Parsing HTML table inside XML document using .net

Question

I am not familiar with the various XML formats. I have received an XML document, and most of it looks OK (I can convert it to a DataSet in .NET using XmlReader etc.). However, I see an HTML table inside that document.

For brevity I am posting only a sample of the HTML:

<table>
 <tr>
  <td>1</td>
  <td>A1</td>
  <td>100</td>
  <td>0</td>
  <td>0</td>
  <td>0</td>
  <td>57.3055058694109</td>
  <td>-34.25451779412</td>
  <td>-52.0038336686283</td>
  <td>58.2700128150308</td>
  <td>-27.6125327409403</td>
  <td>-34.0354177282971</td>
  <td>5.62354809254242</td>
  <td>-0.964506945619888</td>
  <td>-6.6419850531797</td>
  <td>-17.9684159403313</td>
  <td>5.17635156249084</td>
  <td>18.4441134607471</td>
  <td>0.984914387144844</td>
 </tr>
</table>

How can I parse this table using .NET (VB.NET or C#)?

You can add HTML code by selecting it and clicking the Format Code button in the toolbar, which will indent it with four spaces. — SLaks, Sep 28 '10 at 14:03

score 0 · Accepted Answer · edited May 23 '17 at 10:33

Assuming that the table is valid XHTML, you can parse it using the XElement class.
If it isn't, you can parse it using the HTML Agility Pack.

Under no circumstances should you attempt to parse it using regular expressions.
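
The XElement route mentioned above can be sketched as follows. This is a minimal sketch, not the answerer's code: it assumes the table is well-formed XML (as the sample in the question is), and `TableSketch` and `ReadRows` are hypothetical names introduced for illustration. Elements are matched by local name so that an XHTML namespace on the document, if there is one, does not get in the way.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class TableSketch
{
    // Hypothetical helper: returns one string[] per <tr>,
    // holding the text content of each <td> in that row.
    // Assumes the input is well-formed XML; XElement.Parse throws otherwise.
    public static List<string[]> ReadRows(string xml)
    {
        XElement root = XElement.Parse(xml);

        // DescendantsAndSelf finds <tr> elements at any depth,
        // so the table may be the root or nested inside the larger document.
        return root.DescendantsAndSelf()
                   .Where(e => e.Name.LocalName == "tr")
                   .Select(tr => tr.Elements()
                                   .Where(e => e.Name.LocalName == "td")
                                   .Select(td => td.Value)
                                   .ToArray())
                   .ToList();
    }
}
```

For the sample in the question, `TableSketch.ReadRows` would return a single array of 19 strings ("1", "A1", "100", and so on). Numeric cells can then be converted with `double.Parse(cell, CultureInfo.InvariantCulture)` before being loaded into a DataTable.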

edited May 23 '17 at 10:33 by Community

answered Sep 28 '10 at 14:01 by SLaks

While you can't parse HTML in general with regex, you can parse specific forms of HTML with it if you know what your input is. If the above HTML was machine generated and always takes a form similar to the one shown, there's no reason regex couldn't be used, because the goal isn't to parse an entire arbitrary document but rather a small subset of the HTML language. If the above example had <table> replaced with ##, <tr> replaced with !! and <td> replaced with @@, such that it wasn't HTML anymore, would regex be a proper solution? – Kibbee Sep 28 '10 at 14:19
Hi! Just curious: do I need an XSL style sheet so I can transform it? – cshah Sep 28 '10 at 14:34
