I am trying to parse a HTML page (The page isn't known and changes often, however they are always news sites). Basically, I need to pull the news out of a bunch of code downloaded from the site, which i'm trying to do with a regex like this:
Match m = Regex.Match(x.Result, @"<p>(.+?)</p>");
Obvious bad idea - it pulls down anything tagged as a paragraph.
Any better ways to pull a news article or large body of text, separated from the code, from a website?