
I am fetching the source of a website with Html Agility Pack, but what I get is different from the markup I see when I inspect the page with Firebug. I have searched a lot, but it is still not clear to me what I should do. Please tell me how to get the JavaScript code too, along with the HTML. Even when I disable JavaScript in my browser, I still cannot get the JavaScript code along with the source. I am using

string url = "";
HtmlDocument doc = new HtmlDocument();
WebClient client = new WebClient();
// DownloadString returns the raw response body; no scripts are executed
string html = client.DownloadString(url);
doc.LoadHtml(html);

to get the source. Please tell me whether I need a request and response method to get the JS code too.

har07
Shah Rukh

2 Answers


To expand on @alecxe's answer, you can use Selenium* to load your target page the way a real browser would, and then pass the result to HtmlAgilityPack for further processing:

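using HtmlAgilityPack;
using OpenQA.Selenium;
using OpenQA.Selenium.PhantomJS;

.....

// PhantomJS runs the page's JavaScript before the source is read
IWebDriver driver = new PhantomJSDriver();
driver.Navigate().GoToUrl(url);
HtmlDocument doc = new HtmlDocument();
doc.LoadHtml(driver.PageSource);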

Alternatively, you can run your query (XPath or CSS selector) directly in Selenium, for example:

var result = driver.FindElements(By.XPath("your query"));

//print HTML of the returned elements
foreach (var item in result)
{
    Console.WriteLine(item.GetAttribute("outerHTML"));
}

*) You need to download Selenium first, as well as a driver, e.g. PhantomJS, Firefox, etc. Selenium can be installed into your project easily from NuGet.
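
If elements that scripts add after the initial page load are still missing from driver.PageSource, one possible cause is that the source was read before those scripts had finished. Below is a minimal sketch of an explicit wait. It assumes the Selenium.Support NuGet package (for WebDriverWait) is installed next to Selenium and HtmlAgilityPack. The URL, the XPath query, and the 15 second timeout are placeholders, not values from the question.

```csharp
using System;
using HtmlAgilityPack;
using OpenQA.Selenium;
using OpenQA.Selenium.PhantomJS;
using OpenQA.Selenium.Support.UI;

class RenderedPageExample
{
    static void Main()
    {
        string url = "http://example.com/";   // placeholder: your target page
        string xpath = "//div//ol/li";        // placeholder: your own query

        // Disposing the driver also shuts down the PhantomJS process
        using (IWebDriver driver = new PhantomJSDriver())
        {
            driver.Navigate().GoToUrl(url);

            // Poll until at least one matching element exists; a
            // WebDriverTimeoutException is thrown if none shows up in time
            var wait = new WebDriverWait(driver, TimeSpan.FromSeconds(15));
            wait.Until(d => d.FindElements(By.XPath(xpath)).Count > 0);

            // Hand the rendered markup to HtmlAgilityPack
            var doc = new HtmlDocument();
            doc.LoadHtml(driver.PageSource);

            // SelectNodes returns null when nothing matches
            var nodes = doc.DocumentNode.SelectNodes(xpath);
            if (nodes != null)
            {
                foreach (var node in nodes)
                {
                    Console.WriteLine(node.OuterHtml);
                }
            }
        }
    }
}
```

If the wait times out, the elements may be loaded in a different way (for example inside a frame, or only after some user interaction), so this sketch will not cover every case.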

Community
har07
  • Thanks a lot. I have added Selenium and PhantomJS to my project now; hopefully it will work. – Shah Rukh Apr 02 '16 at 13:22
  • I used that and it is working fine. It gives me extra code too, but it still does not give me the li inside an ol inside a div that I want. I can see that div only when I inspect the page with Firebug in Firefox. Please suggest what I should do. – Shah Rukh Apr 02 '16 at 15:13

For that you would need a real browser. Consider automating a browser (which can be headless - see PhantomJS) with the help of Selenium.


Community
alecxe