3

I need to extract content from a web page. One section of the page is populated using AJAX, and when I view the page source, the AJAX-loaded content is not shown. The section's content changes based on which check box is selected: if we select the 'India' check box, the section displays all the details for India. The page source shows only the default content, not the content loaded via AJAX. I checked the page source again after selecting the check box, and it still shows only the default value. How can I get that section's content?

Maddy

2 Answers

4

In C# you can use HtmlAgilityPack to crawl the data. However, webBrowser.DocumentText returns only the original page source, so the AJAX-loaded content is missing and XPath queries cannot find it. Instead, wait until the WebBrowser control has loaded the page completely, then add the code below to the DocumentCompleted event handler:

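// Requires a reference to Microsoft.mshtml (using mshtml;).
HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
// DomDocument exposes the live DOM, including content added by AJAX.
IHTMLDocument2 currentDoc = (IHTMLDocument2)this.webBrowser1.Document.DomDocument;

// Parse the current body markup. activeElement is only the focused element
// (for example the check box just clicked), so body is used instead.
doc.LoadHtml(currentDoc.body.innerHTML);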
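
Once the live markup is in an HtmlAgilityPack document, the section can be picked out with XPath. The following is only a minimal sketch: the `countryDetails` id, the `SectionExtractor` class, and the sample HTML string are hypothetical stand-ins for whatever the real page uses. Note also that DocumentCompleted can fire before an AJAX request has finished, so the handler may need to wait or re-check before reading the DOM.

```csharp
using System;
using HtmlAgilityPack;

public static class SectionExtractor
{
    // Returns the trimmed text of the <div> with the given id, or null if it is absent.
    public static string ExtractSection(string renderedHtml, string sectionId)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(renderedHtml);

        // SelectSingleNode returns null when nothing matches the XPath.
        HtmlNode node = doc.DocumentNode.SelectSingleNode("//div[@id='" + sectionId + "']");
        return node == null ? null : node.InnerText.Trim();
    }

    public static void Main()
    {
        // Stand-in for the markup read from the WebBrowser DOM after the AJAX call.
        string html = "<html><body><div id='countryDetails'>Details of India</div></body></html>";
        Console.WriteLine(ExtractSection(html, "countryDetails"));
    }
}
```

In the WebBrowser scenario, the string passed to `ExtractSection` would be the body markup read from the DOM rather than a literal.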
khr055
0

Use Firebug in Firefox. The Net tab lists the additional requests the page makes, so you can see the extra content being loaded there.

Zuuum
  • Thank you, Zuuum. I was able to see the extra content being loaded. Can you help me access that content from C# code, so that I can extract the required content from the web page? – Maddy Aug 24 '12 at 08:59
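
One possible way to follow up on this: once the Net tab shows the URL of the AJAX request, the same request can often be issued directly from C#, skipping the browser entirely. This is a rough sketch under assumptions. The URL below is a hypothetical placeholder, and the `X-Requested-With` header is only needed by some servers; the real endpoint may also expect cookies, POST data, or other headers visible in the Net tab.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

public static class AjaxFetch
{
    public static async Task Main()
    {
        // Hypothetical endpoint: replace with the request URL shown in the Net tab.
        string url = "http://example.com/details?country=India";

        using (var client = new HttpClient())
        {
            // Some endpoints only answer requests that are flagged as AJAX.
            client.DefaultRequestHeaders.Add("X-Requested-With", "XMLHttpRequest");

            // The response is whatever the page's script receives (HTML fragment, JSON, ...).
            string body = await client.GetStringAsync(url);
            Console.WriteLine(body);
        }
    }
}
```

If the response is an HTML fragment, it can be passed to HtmlAgilityPack's `LoadHtml` in the same way as the browser markup.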