I have a .NET app that uses regular expressions to extract information from some HTML. The HTML is not well-formed XML, so I can't parse it with XDocument. Here is a small piece of the HTML that I'm having problems with:
<td class="program">
<div>
<h2>
The O'Reilly Factor
</h2>
</div>
</td>
<td class="program">
<div>
<span class="font-icon-new">New</span>
<h2>
The Kelly File
</h2>
</div>
</td>
The regular expression I'm using is:
(<td class="program">.*?(?<isnew>font-icon-new)?.*</td>)+
What I expect in this scenario is two captures, one per table cell. The first capture's "isnew" group would be empty (a non-hit), and the second capture's "isnew" group would be populated. However, the "isnew" group is always empty. I've tried multiple variations and simplified the pattern as much as possible, to no avail. I'm also using RegexOptions.Singleline so that "." matches newline characters as well. Any ideas on what I'm missing?
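A minimal console sketch that reproduces this. The `Repro` class name and the call to `Regex.Matches` are assumptions for illustration, since the actual calling code isn't shown here. It runs the pattern above against the sample HTML with `RegexOptions.Singleline` and prints, for each match, how many times the outer group captured and whether "isnew" participated.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical repro harness; not the actual application code.
class Repro
{
    static void Main()
    {
        // The sample HTML from above: two program cells, only the second has the "New" icon.
        string html = string.Join("\n", new[]
        {
            "<td class=\"program\">",
            "<div>",
            "<h2>",
            "The O'Reilly Factor",
            "</h2>",
            "</div>",
            "</td>",
            "<td class=\"program\">",
            "<div>",
            "<span class=\"font-icon-new\">New</span>",
            "<h2>",
            "The Kelly File",
            "</h2>",
            "</div>",
            "</td>"
        });

        // The pattern exactly as given above (quotes doubled for the verbatim string).
        string pattern = @"(<td class=""program"">.*?(?<isnew>font-icon-new)?.*</td>)+";

        // Singleline makes "." match newline characters too.
        MatchCollection matches = Regex.Matches(html, pattern, RegexOptions.Singleline);
        Console.WriteLine("matches: " + matches.Count);

        foreach (Match m in matches)
        {
            // Group 1 is the outer (unnamed) group; Captures lists each repetition under "+".
            Console.WriteLine("outer captures: " + m.Groups[1].Captures.Count);
            // Success is false when the optional named group did not participate.
            Console.WriteLine("isnew matched: " + m.Groups["isnew"].Success);
        }
    }
}
```

Printing `Groups[1].Captures.Count` alongside `Groups["isnew"].Success` should make it visible whether the pattern is splitting the input into one capture per cell at all, which is the first thing to check before looking at "isnew".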
Thanks in advance.