This is what I have so far:
<a href="(http://www.imdb.com/title/tt\d{7}/)".*?>.*?</a>
C#:
ArrayList imdbUrls = matchAll(@"<a href=""(http://www.imdb.com/title/tt\d{7}/)"".*?>.*?</a>", html);
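// Returns the value of capture group i for every match of regex in html.
// Note: group 0 (the default) is the entire match, not the parenthesized part.
private ArrayList matchAll(string regex, string html, int i = 0)
{
    ArrayList list = new ArrayList();
    foreach (Match m in new Regex(regex, RegexOptions.Multiline).Matches(html))
        list.Add(m.Groups[i].Value.Trim());
    return list;
}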
I'm trying to extract IMDb links from an HTML page. What is wrong with this regular expression?
The main idea is to search Google for a movie and then look for a link to IMDb in the results.
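For illustration, here is a minimal sketch of the extraction step on its own. It assumes the href attribute holds the IMDb URL directly, which may not be true for a search results page that rewrites or wraps its links. `ImdbLinkExtractor` and `ExtractTitleUrls` are hypothetical names, not part of the code above. The sketch escapes the dots in the host name and reads capture group 1 (the URL) instead of group 0 (the whole match):

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class ImdbLinkExtractor
{
    // Hypothetical helper: collects IMDb title URLs found in anchor tags.
    // Assumes the URL appears verbatim inside href="..." (an assumption,
    // not something the original page is known to guarantee).
    public static List<string> ExtractTitleUrls(string html)
    {
        var pattern = new Regex(
            @"<a\s[^>]*?href=""(http://www\.imdb\.com/title/tt\d{7}/)""",
            RegexOptions.IgnoreCase);

        var urls = new List<string>();
        foreach (Match m in pattern.Matches(html))
        {
            // Groups[0] would be the matched tag text; Groups[1] is the URL.
            urls.Add(m.Groups[1].Value);
        }
        return urls;
    }
}
```

Calling `ImdbLinkExtractor.ExtractTitleUrls(html)` on a page containing `<a href="http://www.imdb.com/title/tt0111161/" class="l">` should give back a list with that single URL. If the list comes back empty on the real page, it may be worth checking the raw HTML to see how the href values are actually written.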