I am very lost with this regex. I have an HTML table with three fields: Date, Name and Place. The first record of the table doesn't have the field "Place" (I cannot change the table format). At the moment I am using the pattern below:
^<tr><td.*>(.+)<\/td><td>(.+)<\/td><td><font.*>(.+)<\/font><\/td><\/tr> $\n<tr><td.*>(.+)<\/td><\/tr>
This pattern ignores the first record of the table (this record doesn't have the field "Place"). I don't want to create two patterns for the same text. Can anyone help with this issue?
A sample of the table:
<table border cellpadding=1 hspace=10>
<colgroup style='font:8pt Tahoma;color=Black' valign=top><colgroup style='font:8pt Tahoma; color=Navy'><colgroup style='font:8pt Tahoma;color=Maroon'>
<tr>
<td><font FACE=Tahoma color='#CC0000' size=2><b>Date</b></font></td>
<td><font FACE=Tahoma color='#CC0000' size=2><b>Name</b></font></td>
<td><font FACE=Tahoma color='#CC0000' size=2><b>Place</b></font></td>
</tr>
<tr><td rowspan=2>17/08/2011 10:28</td><td>Vivamus sed est ut lorem tempor cursus</td><td><FONT COLOR="000000">Curabitur egestas metus bibendum</font></td></tr>
<tr><td colspan=2>Curabitur id urna elit</td></tr>
<tr><td rowspan=2>17/08/2011 10:26</td><td>UDonec blandit nisl ut nisl elementum</td><td><FONT COLOR="000000"> hendrerit vel ante</font></td></tr>
<tr><td colspan=2>Etiam nec mollis</td></tr>
<tr><td rowspan=2>12/08/2011 09:46</td><td>Nulla et eros a massa</td><td><FONT COLOR="000000">Aenean in mauris eget tellus </font></td></tr>
<tr><td colspan=2>Nulla et eros a massa tristique blandit </td></tr>
<tr><td rowspan=2>12/08/2011 09:45</td><td>orta mi dapibus sit amet. Vestib</td><td><FONT COLOR="000000"> mollis erat consectetur.</font></td></tr>
<tr><td colspan=2>sodales tempor</td></tr>
<tr><td rowspan=1>11/08/2011 10:39</td><td>lorem ipsum</td><td><FONT COLOR="000000">dolor</font></td></tr>
</TABLE>
The current solution is to create two regexes. The first regex catches the table without the first record:
^<tr><td.*>(.+)<\/td><td>(.+)<\/td><td><font.*>(.+)<\/font><\/td><\/tr> $\n<tr><td.*>(.+)<\/td><\/tr>
And the second regex captures the first record:
^<tr><td.*>(.+)<\/td><td>(.+)<\/td><td><font.*>(.+)<\/font><\/td><\/tr> $
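One possible way to merge the two patterns into one is to wrap the second row in an optional non-capturing group, so the last capture is simply empty when that row is missing. The sketch below assumes Python's re module (the post does not say which regex engine is in use), that every record sits on a single line as in the sample, and that the cell in the second row holds plain text with no nested tags. The names RECORD, parse_records and SAMPLE are made up for illustration.

```python
import re

# One pattern for both record shapes. The follow-up row is wrapped in an
# optional non-capturing group, so group 4 is None when that row is absent.
RECORD = re.compile(
    r'^<tr><td[^>]*>(.+?)</td>'                     # first cell (may carry rowspan=...)
    r'<td>(.+?)</td>'                               # second cell
    r'<td><font[^>]*>(.+?)</font></td></tr>[ \t]*'  # third cell, trailing spaces allowed
    r'(?:\r?\n<tr><td[^>]*>([^<]+)</td></tr>)?',    # optional second row (plain text only)
    re.IGNORECASE | re.MULTILINE,
)


def parse_records(html):
    """Return one tuple of four values per record; the fourth is None if missing."""
    records = []
    for match in RECORD.finditer(html):
        records.append(
            tuple(g.strip() if g is not None else None for g in match.groups())
        )
    return records


# Two records taken from the sample table: one with a second row, one without.
SAMPLE = '''<tr><td rowspan=2>12/08/2011 09:45</td><td>orta mi dapibus sit amet. Vestib</td><td><FONT COLOR="000000"> mollis erat consectetur.</font></td></tr>
<tr><td colspan=2>sodales tempor</td></tr>
<tr><td rowspan=1>11/08/2011 10:39</td><td>lorem ipsum</td><td><FONT COLOR="000000">dolor</font></td></tr>
</TABLE>'''

for record in parse_records(SAMPLE):
    print(record)
```

Using [^>]* instead of .* inside the tags and lazy (.+?) captures keeps each group from running past its own cell, and [^<]+ in the optional group stops it from swallowing a following three-cell record. If the table layout is less regular than the sample, an HTML parser is likely a safer choice than a regex.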