1

How can I get an element's value from a web page in C# with the WPF WebBrowser component?

For example, I want to get the value 1.7655 from this page: http://www.forexpros.com/currencies/usd-gel.

Thanks

kol
Irakli Lekishvili
  • 15
    Come on downvoters - you're not going to encourage people to improve their questions if you don't even say "too vague", "not a real question" or even just "RTFM". Give the guy some clues. – MickeyfAgain_BeforeExitOfSO Dec 15 '11 at 21:51
  • 3
    Acid, consider posting some code showing what you've tried. Also, look at the source code of the page you provided: `<span id="last_last">1.7655</span>`. As you can see, the span has an id of `last_last`, which you should be able to leverage. – JYelton Dec 15 '11 at 21:52
  • 1
    In the meantime here are some other questions of interest that may help: [1](http://stackoverflow.com/questions/4803116/how-to-get-html-web-page), [2](http://stackoverflow.com/questions/4358696/how-to-get-the-contents-of-a-html-element-using-htmlagilitypack-in-c), [3](http://stackoverflow.com/questions/5048930/how-do-i-access-a-specific-html-element-using-c) – JYelton Dec 15 '11 at 22:03
  • Acid originally added the tags `wpf` and `webbrowser` which are very important to answer the question. Please do not remove them. – kol Dec 15 '11 at 22:09
  • This question is interesting because Acid uses the WPF WebBrowser, which is incomplete: you cannot parse the tree of HtmlElements easily, because the Document property is simply an object. So I added the term "WPF WebBrowser" to the question (these words were only tags before). – kol Dec 15 '11 at 22:20

4 Answers

5

To get the WPF WebBrowser's content, I found this solution somewhere and it seems to work, but only if the target framework is at least .NET 4.0 and you reference Microsoft.CSharp.dll (which isn't selectable if your target framework is below 4.0). I put it in the LoadCompleted handler:

private void myBrowser_LoadCompleted(object sender, NavigationEventArgs e)
{
    // Document is typed as object; dynamic defers member lookup to run time.
    dynamic doc = myBrowser.Document;
    // Full markup of the loaded page as a string.
    string htmlstring = doc.documentElement.InnerHtml;
}

Add

myBrowser.LoadCompleted += new LoadCompletedEventHandler(myBrowser_LoadCompleted);

after InitializeComponent() to be sure the method is called.
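
If you only need one element rather than the whole page, the same dynamic trick should let you call the DOM's getElementById directly. The following is a rough sketch, not tested against the live page: the id `last_last` comes from the comment under the question, while the window setup and showing the value in the title bar are my own choices for illustration. It needs references to the WPF assemblies and Microsoft.CSharp.dll, and the value may still be missing if the page fills it in with script after loading.

```csharp
using System;
using System.Windows;
using System.Windows.Controls;

public static class WpfBrowserSketch
{
    [STAThread]
    public static void Main()
    {
        var browser = new WebBrowser();
        var window = new Window { Content = browser, Width = 800, Height = 600 };

        browser.LoadCompleted += (sender, e) =>
        {
            // Document is typed as object, so member lookup is deferred via dynamic.
            dynamic doc = browser.Document;
            if (doc == null) return;

            // "last_last" is the span id mentioned in the comments on the question.
            dynamic element = doc.getElementById("last_last");
            if (element != null)
            {
                // Show the element's text in the title bar (illustrative only).
                window.Title = (string)element.innerText ?? "(empty)";
            }
        };

        browser.Navigate(new Uri("http://www.forexpros.com/currencies/usd-gel"));
        new Application().Run(window);
    }
}
```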

Sam
Dick
1

There is no generic way to get a value from an arbitrary element: you need to know the HTML structure of the specific page and how to find the element you are looking for. If you know both, you can read the page into some sort of HTML document (XmlDocument would work if the HTML were guaranteed to be well-formed XML) and then get the value from there.

Optionally, you can run the page through some sort of HTML cleanup (maybe NTidy?) and then load it into an XmlDocument. One drawback of this approach is that the structure of the page may change during the cleanup.
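
As a minimal sketch of the XmlDocument route, assuming the markup is already well-formed XML (real pages often are not): the class and method names below are made up for illustration, and the inline string is a stand-in for the downloaded page, not its actual markup.

```csharp
using System;
using System.Xml;

public static class XmlValueReader
{
    // Returns the text of the first element carrying the given id, or null if none is found.
    // The id is pasted into the XPath as-is, so it must not contain a single quote.
    public static string GetTextById(string wellFormedHtml, string id)
    {
        var doc = new XmlDocument();
        doc.LoadXml(wellFormedHtml); // throws XmlException if the markup is not well-formed
        XmlNode node = doc.SelectSingleNode("//*[@id='" + id + "']");
        return node == null ? null : node.InnerText;
    }

    public static void Main()
    {
        // Hypothetical, simplified stand-in for the page source.
        string html = "<html><body><div><span id=\"last_last\">1.7655</span></div></body></html>";
        Console.WriteLine(GetTextById(html, "last_last"));
    }
}
```

If LoadXml throws on the real page, that is the case where a cleanup pass would be needed first.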

Daniel Gabriel
1

After you call the Navigate method of the WPF WebBrowser component to open a webpage, the LoadCompleted event is raised (DocumentCompleted is the name of the equivalent event on the WinForms control), and you can safely browse the content of the page (note that this event sometimes occurs multiple times). The Document property of WebBrowser contains the HTML in an already processed format, called the DOM tree. Unfortunately, you cannot use this property easily, since it is typed only as object. This feature has not been completed in WPF (as of December 2011).

I would use the WinForms version of WebBrowser instead. You can use it in a WPF application if you embed it into a WindowsFormsHost. This class is complete: its Document property is an HtmlDocument object, with a Body property, which is an HtmlElement, which contains the content of the page. You can walk the DOM tree recursively to find the element you want (and read its InnerText), or simply process the text of the whole page using Regex or an HTML parser library.
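
Here is a rough sketch of that setup, assuming references to the WPF assemblies plus System.Windows.Forms and WindowsFormsIntegration. The id `last_last` is taken from the comments on the question; the window setup and writing the value to the title bar are only for illustration. Because the element id is known, the sketch looks it up with GetElementById instead of walking the tree.

```csharp
using System;
using System.Windows;
using System.Windows.Forms.Integration;
using WinForms = System.Windows.Forms;

public static class WinFormsHostSketch
{
    [STAThread]
    public static void Main()
    {
        var browser = new WinForms.WebBrowser { ScriptErrorsSuppressed = true };
        var host = new WindowsFormsHost { Child = browser };
        var window = new Window { Content = host, Width = 800, Height = 600 };

        // DocumentCompleted can fire more than once per page, so the handler
        // simply re-reads the element each time and ignores a missing one.
        browser.DocumentCompleted += (sender, e) =>
        {
            if (browser.Document == null) return;
            WinForms.HtmlElement element = browser.Document.GetElementById("last_last");
            if (element != null)
            {
                window.Title = element.InnerText ?? "(empty)";
            }
        };

        browser.Navigate("http://www.forexpros.com/currencies/usd-gel");
        new Application().Run(window);
    }
}
```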

Sam
kol
  • 2
    Now that .NET is in 4.5, would you still advise users to use the Winforms version of WebBrowser, or have these issues been resolved? – Doug Nov 19 '13 at 03:21
0

You have several options to read a value from a webpage.

  1. Load the page in a WebBrowser control. Then find out whether the element containing the desired value has a certain name or id, and get that element from the Document property of the WebBrowser control.
  2. Use the HtmlAgilityPack to parse the HTML of that webpage, find the element, and read its value.
  3. Find out whether the webpage has a certain structure and use a regular expression to find the desired value (can be tricky!).
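
To give an idea of the third option, here is a small sketch using only System.Text.RegularExpressions. The helper name and the sample markup are made up for illustration. The pattern assumes the value sits in a span with a quoted id and no nested spans, which is exactly the kind of assumption that makes this approach fragile.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexValueReader
{
    // Returns the inner text of the first <span> whose id attribute equals the given id,
    // or null if there is no match. Nested spans inside the target are not handled.
    public static string GetSpanTextById(string html, string id)
    {
        string pattern = "<span[^>]*\\bid\\s*=\\s*[\"']" + Regex.Escape(id) + "[\"'][^>]*>(.*?)</span>";
        Match match = Regex.Match(html, pattern, RegexOptions.IgnoreCase | RegexOptions.Singleline);
        return match.Success ? match.Groups[1].Value.Trim() : null;
    }

    public static void Main()
    {
        // Hypothetical fragment standing in for the downloaded page source.
        string html = "<div><span class=\"arial_26\" id=\"last_last\">1.7655</span></div>";
        Console.WriteLine(GetSpanTextById(html, "last_last"));
    }
}
```

A real HTML parser is usually the more robust choice once the page structure gets any more complicated than this.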

So, you see, there are many ways to find your desired value (and I don't think these are all the options). Go ahead and spend some effort on getting that value. And if you have a question about a specific problem, don't hesitate to ask again on Stack Overflow. But please spend some time formulating your question. Remember: a good question will very often get good answers!

Fischermaen
  • But if a certain text appears more than once (regarding your second point), you may get stuck, at least if your aim is to get a unique CSS path or an id. I mean you could use a combobox to let the user choose, at least if he is experienced enough. – user254197 Jun 12 '15 at 18:27