I'm working on a movie scraper / auto-downloader that iterates over my current movie collection, finds new recommendations, and downloads the new goods.
There is a part where I scrape IMDb for metadata, and it seems to get stuck in this one spot; I can't figure out why. It has run this same code on different IMDb pages just fine (this is the 29th iteration, each on a new page).
I am using C#.
The code:
private string Match(string regex, string html, int i = 1)
{
    return new Regex(regex, RegexOptions.Multiline).Match(html).Groups[i].Value.Trim();
}
regex parameter string contents:
<title>.*?\\(.*?(\\d{4}).*?\\).*?</title>
html parameter string contents: too big to paste here, but literally the html string representation of http://www.imdb.com/title/tt4422748/combined
In Chrome, you can view it easily with:
view-source:http://www.imdb.com/title/tt4422748/combined
I have paused execution in Visual Studio and stepped forward; it continues to run but just hangs (it doesn't let me step, it just runs). If I hit pause again, it returns to the same spot with the same parameter values (and no, I am not calling it in an infinite loop). I'm pretty new to regex, so any help would be appreciated!
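For debugging, one way to turn the hang into an exception might be to give the regex a match timeout. This is only a minimal sketch, assuming .NET 4.5 or later (where the Regex constructor accepts a TimeSpan); the class name RegexTimeoutSketch, the method name MatchWithTimeout, and the 2-second limit are hypothetical and not part of my actual code:

```csharp
using System;
using System.Text.RegularExpressions;

static class RegexTimeoutSketch
{
    // Hypothetical variant of Match() that gives up instead of running indefinitely.
    // The 2-second limit is an arbitrary assumption chosen for illustration.
    public static string MatchWithTimeout(string regex, string html, int i = 1)
    {
        try
        {
            var re = new Regex(regex, RegexOptions.Multiline, TimeSpan.FromSeconds(2));
            return re.Match(html).Groups[i].Value.Trim();
        }
        catch (RegexMatchTimeoutException ex)
        {
            // Report which pattern exceeded the time limit, then fall back to an empty result.
            Console.Error.WriteLine("Regex timed out: " + ex.Pattern);
            return string.Empty;
        }
    }
}
```

This would not explain why the match takes so long on this page, but it should at least show whether the time is being spent inside the regex engine itself.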