I have an HTML string, `ResultsString`, that I need to parse:

         <table id="Table1">
            <tr>
              <td width="50%">
                 Result: <span style="font-weight:bold; color:GREEN;"></span>
               </td>
               <td width="50%">
                  ID: <span style="font-weight:bold;">790043</span>
               </td>
           </table>
         <table id="Table2">
            <tr>
              <td class="name">
                Status:
             </td>
             <td class="value">
                None
             </td>
             </tr>

        </table>
<br /><br />
<a href="#" onclick="$('#vvvv').toggle();return false;" /></a>
<br />
<div id="pp1" style="displa
</div>

How would I extract (substring) only the markup inside the two table tags, so that my resulting HTML string would be:

   <table id="Table1">
            <tr>
              <td width="50%">
                 Result: <span style="font-weight:bold; color:GREEN;"></span>
               </td>
               <td width="50%">
                  ID: <span style="font-weight:bold;">790043</span>
               </td>
           </table>
         <table id="Table2">
            <tr>
              <td class="name">
                Status:
             </td>
             <td class="value">
                None
             </td>
             </tr>

        </table>

Please suggest an approach.

Thank you.

user575219

2 Answers

0

You want to transform an HTML file? That's an XSLT job.

joce
0

As suggested, you should use an HTML parser such as the HTML Agility Pack. Otherwise, you may run into problems if you have nested structures, etc.

For this simple case though, you can use this regular expression:

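// Requires: using System.Text.RegularExpressions;
// With Singleline, the greedy .+ runs from the first <table to the last </table>.
string html = Regex.Match(ResultsString,
                          @"<table.+<\/table>",
                          RegexOptions.Singleline).Value;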

But again, only if your input string is as simple as you showed us!

Julián Urbano
  • Please do not tell beginners to use Regex to parse HTML; it is never appropriate. If the HTML truly is as simple as claimed, then `String.Substring` is adequate. If this is not adequate, then neither is a Regex. – Dour High Arch Apr 03 '13 at 20:02
  • a) I do explicitly recommend to use a parser instead. b) so when it really is that simple, `Substring` is ok and `Regex` is not? Give me a break – Julián Urbano Apr 03 '13 at 21:07
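
For reference, the `String.Substring` approach mentioned in the first comment could look roughly like the sketch below. It is not code from either commenter; the class and method names (`TableExtractor`, `ExtractTables`) are hypothetical. It assumes the input is as simple as in the question: everything from the first `<table` to the last `</table>` is wanted, including any markup that happens to sit between the tables.

```csharp
using System;

public static class TableExtractor
{
    // Hypothetical helper: returns the text from the first "<table" through
    // the last "</table>", or an empty string if either marker is missing.
    public static string ExtractTables(string html)
    {
        const string closeTag = "</table>";

        int start = html.IndexOf("<table", StringComparison.OrdinalIgnoreCase);
        int end = html.LastIndexOf(closeTag, StringComparison.OrdinalIgnoreCase);

        // Covers: no opening tag, no closing tag, or a closing tag that
        // appears before the first opening tag.
        if (start < 0 || end < start)
        {
            return string.Empty;
        }

        // Length runs up to and including the final closing tag.
        return html.Substring(start, end + closeTag.Length - start);
    }
}
```

It would be called as `string html = TableExtractor.ExtractTables(ResultsString);`. Like the regular expression above, this does no real parsing, so for nested or irregular markup an HTML parser remains the safer choice.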