I'm having a small problem with a VB.NET scraper. It's supposed to extract all the links from an HTML string that I have already downloaded. The links are in the string (I have checked), so the problem must be in my regex.
My regex, as written inside the VB.NET string literal (so each "" is an escaped "): <a.*?href=""(.*?)"".*?>(.*?)</a>
This works for some sites, but for others it does not.
Here are examples from the HTML source that match and don't match.
Working:
<a href="http://domain.com" rel="nofollow" onmousedown="return clk('25936','3')" target="_blank"></a>
Not working:
<a href='http://domain.com' target="_blank" ><font size=2><b>text</b></a>
Could it be because one link quotes the href value with " and the other with '?
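A minimal sketch for testing that hypothesis is below. It runs the pattern above against a double-quoted and a single-quoted link, then tries a variant that accepts either quote character. The module name LinkRegexCheck, the variable names, and the eitherQuote pattern are illustrative assumptions and not part of the scraper; the double-quoted sample is a shortened stand-in for the working link, with "text" as placeholder link text.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module LinkRegexCheck
    Sub Main()
        ' The pattern from the question: the href value must be wrapped in double quotes.
        Dim original As String = "<a.*?href=""(.*?)"".*?>(.*?)</a>"

        ' Hypothetical variant: ("|') captures whichever quote opens the value,
        ' and the backreference \1 requires the same quote to close it.
        ' The URL moves to group 2 and the link text to group 3.
        Dim eitherQuote As String = "<a.*?href=(""|')(.*?)\1.*?>(.*?)</a>"

        ' Shortened stand-in for the working link (placeholder link text).
        Dim doubleQuoted As String = "<a href=""http://domain.com"" target=""_blank"">text</a>"
        ' The non-working link from the question.
        Dim singleQuoted As String = "<a href='http://domain.com' target=""_blank"" ><font size=2><b>text</b></a>"

        Console.WriteLine(Regex.IsMatch(doubleQuoted, original))   ' True
        Console.WriteLine(Regex.IsMatch(singleQuoted, original))   ' False

        Dim m As Match = Regex.Match(singleQuoted, eitherQuote, RegexOptions.IgnoreCase Or RegexOptions.Singleline)
        Console.WriteLine(m.Success)                               ' True
        Console.WriteLine(m.Groups(2).Value)                       ' http://domain.com
    End Sub
End Module
```

If the second check prints False while the variant matches, the quote character is the cause. RegexOptions.IgnoreCase and RegexOptions.Singleline are optional here; they are included on the assumption that real pages may contain uppercase tags or links that span several lines.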