
There are loads of posts similar to this.

How to get rendered html (processed by Javascript) in WebBrowser control? suggests using something like

webBrowser1.Document.GetElementsByTagName("HTML")[0].OuterHtml;

Document is treated as an object, so I have no option to use GetElementsByTagName.

Copy all text from webbrowser control suggests using DocumentText

I have Document but no DocumentText.

That post also suggests webBrowser.Document.Body.InnerText;

I have the option to use webBrowser.Document but that is it. For some reason webBrowser.Document is an object and as such I can't access these methods.

Getting the HTML source through the WebBrowser control in C# also suggests using DocumentStream. Again, I don't have that.

I'm doing this in a WPF application and using WebBrowser from System.Windows.Controls.

All I'm trying to do is read the rendered HTML from the web page.

My code

public void Begin(WebBrowser wb)
{
    this._wb = wb;
    _wb.Navigated += _wb_Navigated;
    _wb.Navigate("myUrl");
}

private void _wb_Navigated(object sender, System.Windows.Navigation.NavigationEventArgs e)
{
    var html = _wb.Document; // this is where I need help
}
MyDaftQuestions

1 Answer


The samples you refer to are for the WinForms WebBrowser control, not the WPF one. Add a reference to Microsoft.mshtml to your project (search for it in the Add Reference dialog).

Cast the Document property to HTMLDocument in order to access its methods and properties (as stated on MSDN).

See also my GitHub sample:

private void WebBrowser_Navigated(object sender, NavigationEventArgs e)
{
    // Cast the untyped Document property to the mshtml type to reach its members.
    var document = (HTMLDocument)_Browser.Document;
    _Html.Text = document.body.outerHTML;
}
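
For illustration only (this is not part of the original answer): a minimal sketch of how the same cast might be applied to the code from the question. It assumes the project references Microsoft.mshtml, and it subscribes to LoadCompleted instead of Navigated, on the assumption that the document may not be fully loaded yet when Navigated fires. The class name HtmlReader and the handler name OnLoadCompleted are hypothetical; "myUrl" is the placeholder from the question.

```csharp
using System.Windows.Controls;
using System.Windows.Navigation;
using mshtml; // requires a project reference to Microsoft.mshtml

// Hypothetical wrapper class; only Begin and the _wb field come from the question.
public class HtmlReader
{
    private WebBrowser _wb;

    public void Begin(WebBrowser wb)
    {
        _wb = wb;
        // Assumption: LoadCompleted is raised once the document has finished loading,
        // so the DOM should be available in the handler.
        _wb.LoadCompleted += OnLoadCompleted;
        _wb.Navigate("myUrl"); // placeholder URL from the question
    }

    private void OnLoadCompleted(object sender, NavigationEventArgs e)
    {
        // WebBrowser.Document is typed as object in WPF, hence the cast.
        var document = (HTMLDocument)_wb.Document;

        // documentElement is the root <html> element, so this includes head and body.
        string html = document.documentElement.outerHTML;
    }
}
```

Content that scripts add after the load event may still be missing at this point, so treat the result as a snapshot of the DOM at load time rather than a guaranteed final rendering.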
earloc
  • If you do not need to actually render the html on the screen, you may be better off using [WebClient](https://msdn.microsoft.com/de-de/library/system.net.webclient(v=vs.110).aspx) – earloc May 11 '17 at 10:10
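
As a rough sketch of the WebClient suggestion in the comment above (the class and method names here are hypothetical): this downloads the page source as a string without any browser control. Note that WebClient does not execute JavaScript, so the result is the raw HTML sent by the server, not the rendered DOM.

```csharp
using System.Net;

// Hypothetical helper, shown only to illustrate the WebClient approach.
public static class RawHtmlFetcher
{
    public static string Fetch(string url)
    {
        // WebClient is disposable, so wrap it in a using block.
        using (var client = new WebClient())
        {
            // Returns the response body as a string; scripts on the page are not run.
            return client.DownloadString(url);
        }
    }
}
```

This is probably only a fit when the page does not build its content with scripts; otherwise the browser-control approach from the answer is the one that sees the processed HTML.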