I have to match a large number of records in HTML. I want to match each record with a regular expression (using .NET Regex Match).
Each record is formatted like this (the full page consists of ordinary HTML plus ~100 records like the following):
<tr onclick="window.location.href='Vareauktion.asp?VISSER=Ja&funk=detaljedata&ID=14457'" style="cursor:hand" onmouseover="bgColor='#808080'" onmouseout="bgColor='#4b4b4b'" bgcolor="#4b4b4b">
<td valign="top">
<div id='OrdreID14457'></div>
<script>RunTimer('OrdreID14457', '04-10-2010 14:30:22');</script>
<em><font size="-1">04-10-2010 14:30:22</font></em></td>
<td valign="top"> Voldby (28|0)</td>
<td valign="top">02:16:00</td>
<td valign="top">09-10-2010<br>15:30:22</td>
<td valign="top">Modeltog <img src="images/Gods_Modeltog.gif" alt="Modeltog" height="15" border="0"></td>
<td valign="top">6603 T.</td>
<td valign="top">
<img src='images/moneter.gif' height='13' alt='Moneter'>5.751.213
</td>
<td valign="top">
</td>
<td valign="top">
</td>
</tr>
I've tried the following so far:
Regex:
id='OrdreID.*[^(<td colspan="9" height="1" bgcolor="#000000">)]*<td colspan="9" height="1" bgcolor="#000000">
What I am trying to do is the following:
- Start my match at: id='OrdreID
- Accept everything afterwards, UNTIL it sees:
<td colspan="9" etc.
- Then, finally, match that closing tag itself:
<td colspan="9" height="1" bgcolor="#000000">
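The problem with my current attempt is that the negated character class [^...] only excludes individual characters, NOT a whole string, so it does not stop at the <td colspan="9" ...> tag the way I intended.
I have been reading about "lookahead", but I have no idea how to use it in this situation.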
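To make the three steps above concrete, here is a minimal sketch of the behaviour I am after, using a negative lookahead that is checked before every consumed character. It is written in Python only because it is quick to run; as far as I can tell the `(?:(?!...).)*` construct uses the same syntax in .NET, where `RegexOptions.Singleline` would play the role of `re.DOTALL`. The shortened sample HTML and the names `SEPARATOR`, `RECORD`, `html` and `records` are made up for illustration and are not from the real page.

```python
import re

# The tag that terminates each record (taken from my regex attempt above).
SEPARATOR = '<td colspan="9" height="1" bgcolor="#000000">'

# Step 1: start at id='OrdreID
# Step 2: consume one character at a time, but only while the separator
#         does NOT start at the current position (negative lookahead)
# Step 3: match the separator itself
RECORD = re.compile(
    r"id='OrdreID"
    r'(?:(?!<td colspan="9" height="1" bgcolor="#000000">).)*'
    r'<td colspan="9" height="1" bgcolor="#000000">',
    re.DOTALL,  # let "." also match newlines, since a record spans several lines
)

# Hypothetical, heavily shortened stand-in for the real page.
html = (
    "<tr><td><div id='OrdreID14457'></div></td>\n<td>6603 T.</td></tr>\n"
    "<tr>" + SEPARATOR + "</td></tr>\n"
    "<tr><td><div id='OrdreID14458'></div></td>\n<td>1200 T.</td></tr>\n"
    "<tr>" + SEPARATOR + "</td></tr>\n"
)

# No capturing groups are used, so findall returns the full text of each match.
records = RECORD.findall(html)

for record in records:
    print(repr(record))
```

Because the lookahead refuses to step over the separator, each match should end at the first separator after its `id='OrdreID`, instead of running on to the last one in the page the way a greedy `.*` does.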
Thanks a lot!! Best regards, Lars