0

I have a website that I load into my form's WebBrowser control. After the document loads, I retrieve webBrowser.DocumentText so that I can parse a specific table. The table is not in that text, even though I can see it displayed in the form's browser.

This table is loaded and appended to the document by JavaScript that is already on the page. When I right-click and select "View Source", the document that opens contains the correct HTML.

My question is: how can I get the same document that "View Source" shows, or is there any way to get the document as it is rendered in the form?

2 Answers

0

Instead of using the WebBrowser control, use HtmlAgilityPack to parse the data you need.

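// Requires: using System.Linq; using System.Net; using HtmlAgilityPack;
var html = new HtmlDocument();
// Download the raw HTML and load it into the parser.
html.LoadHtml(new WebClient().DownloadString("http://www.asp.net"));
var root = html.DocumentNode;
// Select every node whose class attribute is exactly "common-post".
var commonPosts = root.Descendants()
    .Where(n => n.GetAttributeValue("class", "").Equals("common-post"));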
  • But this will only parse the raw HTML. What about dynamic HTML, where injected scripts make DOM changes? I need the latest, changed DOM. – Mukesh Miraculous Mar 10 '19 at 07:32
0

Similar Existing Question

The issue above was very similar to mine, and after going through the answer I learned that I need to wait and poll the WebBrowser to get the dynamic content.

I did not actually implement the code provided in that answer. Instead, I made my DocumentCompleted event handler async and added an awaited Task.Delay of 5 seconds:

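private async void Browser_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
    // Give the page's scripts time to finish modifying the DOM.
    await Task.Delay(5000);
    // Read the live DOM rather than the originally downloaded source.
    var html = wb.Document.GetElementsByTagName("HTML")[0].OuterHtml;
}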

Now I get the dynamic result. Thanks.
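
For reference, the "wait and poll" idea mentioned above could be sketched roughly as follows, as an alternative to a fixed 5-second delay. This is only an illustration under assumptions: the `Polling` class and `WaitUntilAsync` helper are hypothetical names that do not come from the linked answer, and the condition in the usage comment assumes the content of interest is a `<table>` element.

```csharp
using System;
using System.Threading.Tasks;

public static class Polling
{
    // Hypothetical helper: repeatedly evaluates `condition` until it returns
    // true or `timeout` elapses. Returns true if the condition was met.
    public static async Task<bool> WaitUntilAsync(
        Func<bool> condition, TimeSpan timeout, TimeSpan interval)
    {
        var deadline = DateTime.UtcNow + timeout;
        while (DateTime.UtcNow < deadline)
        {
            if (condition()) return true;
            await Task.Delay(interval); // yield between checks
        }
        return condition(); // one last check at the deadline
    }
}

// Possible usage inside Browser_DocumentCompleted (assumes a <table> appears
// once the page's scripts have run):
//   bool found = await Polling.WaitUntilAsync(
//       () => wb.Document.GetElementsByTagName("table").Count > 0,
//       TimeSpan.FromSeconds(10), TimeSpan.FromMilliseconds(250));
//   if (found) { var html = wb.Document.GetElementsByTagName("HTML")[0].OuterHtml; }
```

Compared with a fixed delay, polling returns as soon as the content appears and reports a timeout instead of silently reading an incomplete DOM. The timeout and interval values shown are arbitrary.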