I am using WebClient (C#)

using System;
using System.Net;
using System.Text;
using HtmlAgilityPack;
using Fizzler.Systems.HtmlAgilityPack;

// Download the raw HTML as UTF-8 text.
WebClient client = new WebClient();
client.Encoding = Encoding.UTF8;
var response = client.DownloadString("http://www.iryoucai.com/PositionDefault_069.html");
var doc = new HtmlDocument();
HtmlDocumentExtensions.LoadHtml2(doc, response);
var docNode = doc.DocumentNode;
// This is a span styled with display: none; in the browser it contains c-hr@cg.com.cn
var emailNode = docNode.QuerySelector("#positionEamil");
Console.WriteLine(emailNode.InnerText); // output: ' '

I use this with Fizzler to crawl some web pages in a certain domain, but I have run into a thorny problem: if the HTML source contains a tag whose style is display: none, I cannot get the content/text of that tag. Any ideas? Thanks in advance!

paul cheung
  • Look at the HTML: if there is no value there, then HtmlAgilityPack will not help you. You'll need a real browser if the value is populated by script... – Alexei Levenkov Apr 09 '14 at 05:04
  • Thanks. Maybe setting display to none is partly for anti-crawling purposes, though not only for that? – paul cheung Apr 09 '14 at 06:12
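
The check suggested in the comment above can be sketched as follows. HtmlAgilityPack parses markup and does not apply CSS, so a display: none style by itself should not hide text from InnerText; if the span comes back empty, the text is most likely absent from the downloaded HTML and filled in later by script. The markup below is a hypothetical stand-in for the real page, and the class name HiddenSpanCheck, the method GetTextById, and the emptySpan id are illustrative names, not part of the original code.

```csharp
using System;
using HtmlAgilityPack;

public static class HiddenSpanCheck
{
    // Returns the inner text of the element with the given id,
    // or null when no such element exists in the markup.
    public static string GetTextById(string html, string id)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);
        var node = doc.DocumentNode.SelectSingleNode("//*[@id='" + id + "']");
        return node?.InnerText;
    }

    public static void Main()
    {
        // Assumed markup: one hidden span with text, one hidden span left empty
        // (as it would be if a script populated it after page load).
        const string html =
            "<html><body>" +
            "<span id=\"positionEamil\" style=\"display:none\">c-hr@cg.com.cn</span>" +
            "<span id=\"emptySpan\" style=\"display:none\"></span>" +
            "</body></html>";

        // The hidden span's text is still returned, because CSS is never evaluated.
        Console.WriteLine("filled: '" + GetTextById(html, "positionEamil") + "'");
        // An empty result means the text is simply not in the static HTML.
        Console.WriteLine("empty:  '" + GetTextById(html, "emptySpan") + "'");
    }
}
```

Running the same lookup against the string returned by DownloadString (or viewing the page source in a browser) shows which of the two cases applies to the real page.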

0 Answers