I want to navigate to a specific website and then display only a portion of it in the web browser, the part that starts with:

<div id="dex1" ...... </div>

I know I need to get the element by id, but first I tried writing this:

string data = webBrowser.Document.Body.OuterHtml;

So from data I need to grab the content of the div with that id, display it, and discard the rest.

Any idea on this?

John Saunders
user3712833
  • I would do an `HttpWebRequest` and extract only one
    from the html response, so you can display it to the user, on a browser. Problem with that is that you have to keep all of the `
    – Andrei Dvoynos Aug 09 '14 at 15:50
  • I would like a code example, if possible – user3712833 Aug 09 '14 at 15:52
  • I'm working with Windows Forms – user3712833 Aug 09 '14 at 15:52
  • http://stackoverflow.com/questions/11573919/reading-response-from-url-using-http-web-request – Andrei Dvoynos Aug 09 '14 at 15:54
  • Yes, but I need to edit the response before I display it, because the whole response (the whole HTML) is being displayed in the web browser – user3712833 Aug 09 '14 at 15:57
  • You can use the code from the answer above, edit the string however you like (remove all the content you don't need) and then set the HTML on the WebBrowser like this: `webBrowser1.DocumentText = html;` This would bring other problems though, like relative references and loading of external images, etc. Without actually knowing what you're really looking for, it's hard to answer what would be the best approach. Do you need only text? With little or no formatting? – Andrei Dvoynos Aug 09 '14 at 16:02
  • I need to extract the div with id="dex1", which is a map like in Google Maps. I want only that to be displayed in the WebBrowser and the rest to be gone. – user3712833 Aug 09 '14 at 16:05
  • I have edited your title. Please see, "[Should questions include “tags” in their titles?](http://meta.stackexchange.com/questions/19190/)", where the consensus is "no, they should not". – John Saunders Aug 09 '14 at 16:24

2 Answers

webBrowser1.DocumentCompleted += (sender, e) =>
{
    webBrowser1.DocumentText = webBrowser1.Document.GetElementById("dex1").OuterHtml;
};

On second thought, don't do that: setting the DocumentText property causes the DocumentCompleted event to fire again. So maybe do:

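webBrowser1.DocumentCompleted += webBrowser1_DocumentCompleted;

void webBrowser1_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
    // Unsubscribe first: assigning DocumentText raises DocumentCompleted again.
    webBrowser1.DocumentCompleted -= webBrowser1_DocumentCompleted;

    // GetElementById returns null when the page has no element with that id.
    HtmlElement target = webBrowser1.Document.GetElementById("dex1");
    if (target != null)
    {
        webBrowser1.DocumentText = target.OuterHtml;
    }
}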

Although in most real-world cases I'd expect you'd get better results by injecting some JavaScript to do the DOM manipulation, a la Andrei's answer.

Edit: to replace just the contents of the body tag, which might, if you're lucky, keep all the required styling and scripts (provided they are all in the head and don't reference any of the discarded content), you may have some joy with:

webBrowser1.Document.Body.InnerHtml = webBrowser1.Document.GetElementById("dex1").OuterHtml;
stovroz

Since you probably need a lot of external resources like scripts and images, you can add some custom JavaScript to modify the DOM however you like after the document has loaded from your website. Based on "How to update DOM content inside WebBrowser Control in C#?", it would look something like this:

// IHTMLScriptElement comes from the mshtml interop assembly (add a reference to Microsoft.mshtml).
HtmlElement headElement = webBrowser1.Document.GetElementsByTagName("head")[0];
HtmlElement scriptElement = webBrowser1.Document.CreateElement("script");
IHTMLScriptElement domScriptElement = (IHTMLScriptElement)scriptElement.DomElement;
// 'body > *' selects every direct child of <body>.
domScriptElement.text = "function applyChanges(){ $('body > *').hide(); $('#dex1').show().prependTo('body'); }";
headElement.AppendChild(scriptElement);

// Call the next line whenever you want to execute your code
webBrowser1.Document.InvokeScript("applyChanges");

This also assumes that jQuery is available on the page, so you can do simple DOM manipulation.

The JavaScript code just hides all direct children of the body and then prepends the '#dex1' div to the body, so that it's at the top and visible.
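
If the page does not load jQuery, the same idea can probably be expressed with plain DOM calls. What follows is only a minimal sketch and has not been tried inside the WebBrowser control. The `targetId` and `doc` parameters are hypothetical additions of mine (the `applyChanges` above takes no arguments); `doc` exists only so the function can be exercised outside a browser. It sticks to old-style syntax (`var`, a plain `for` loop) on the assumption that the control may be running an older script engine.

```javascript
// Sketch: plain-DOM counterpart of the jQuery applyChanges above.
// targetId and doc are hypothetical parameters added for illustration.
function applyChanges(targetId, doc) {
  doc = doc || document; // fall back to the page's own document
  var target = doc.getElementById(targetId);
  if (!target) {
    return false; // no such element; leave the page untouched
  }
  // Hide every direct child of <body>.
  var children = doc.body.children;
  for (var i = 0; i < children.length; i++) {
    children[i].style.display = "none";
  }
  // Move the target to the top of <body>; this also pulls it out of
  // any ancestor that was just hidden.
  if (doc.body.firstChild !== target) {
    doc.body.insertBefore(target, doc.body.firstChild);
  }
  // Clear the inline display value so the element is shown again.
  target.style.display = "";
  return true;
}
```

With this variant the id would be passed from C# through the `InvokeScript(string, object[])` overload, for example `webBrowser1.Document.InvokeScript("applyChanges", new object[] { "dex1" });`. Note that clearing the inline `display` value will not reveal an element that a stylesheet rule hides.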

Community
Andrei Dvoynos