I need to parse the option values of a select element in an HTML file. I have this HTML file:

<html>
<head></head>
<body>
    <select id="region" name="region">
        <option value="0"  selected>Všetky regiony</option> 
        <optgroup>Banskobystrický kraj</optgroup>
        <option value="k_1">Banskobystrický kraj</option>
        <option value="1">Banská Bystrica</option>
        <option value="3">Banská Štiavnica</option>
        <option value="18">Brezno</option>
        <option value="22">Detva</option>
        <option value="58">Dudince</option>
    </select>
</body>
</html>

I need to get each option's value and its text into a dictionary. I load this file into a WebBrowser control and try to get the select tag by its ID, "region".

        webBrowser1.Url = new Uri("file://\\C:\\1.html");

        if (webBrowser1.Document != null)
        {
            HtmlElement elems = webBrowser1.Document.GetElementById("region");
        }

But the elems object is null and I don't know why. Any advice?

EDIT: The problem was resolved with Html Agility Pack. Thanks, everybody. I was stupid; I should have listened to your advice about Html Agility Pack in the first place.

2 Answers

You can do it with the HtmlAgilityPack. There are many examples of using it to parse HTML, which you can find with a Google search. Here are a few:

http://htmlagilitypack.codeplex.com/wikipage?title=Examples&referringTitle=Home

How to use HTML Agility pack

UPDATE:

While I think using the library is the better choice, you can do it with the WebBrowser control in the following manner:

    webBrowser1.DocumentCompleted += 
          new WebBrowserDocumentCompletedEventHandler(ParseOptions);

    webBrowser1.Url = new Uri("C:\\1.html", UriKind.Absolute);

    private void ParseOptions(object sender,
        WebBrowserDocumentCompletedEventArgs e)
    {
        HtmlElement elems = webBrowser1.Document.GetElementById("region");
    }

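Notice that the parsing is done in the DocumentCompleted event handler. Setting Url only starts an asynchronous navigation, so the requested page is not yet loaded when the next statement runs, which is why GetElementById returned null in the question.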
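
As a rough sketch of how that handler could go on to fill a dictionary with the option values and texts the question asks for: the `RegionForm` class and the `regions` field are hypothetical names introduced here, and the code assumes a Windows Forms project with the file at the path used above.

```csharp
using System;
using System.Collections.Generic;
using System.Windows.Forms;

// Hypothetical form hosting the WebBrowser control from the question.
public class RegionForm : Form
{
    private readonly WebBrowser webBrowser1 = new WebBrowser();

    // Maps each option's value attribute to its visible text.
    private readonly Dictionary<string, string> regions =
        new Dictionary<string, string>();

    public RegionForm()
    {
        Controls.Add(webBrowser1);

        // Subscribe before navigating so the event cannot be missed.
        webBrowser1.DocumentCompleted += ParseOptions;
        webBrowser1.Url = new Uri("C:\\1.html", UriKind.Absolute);
    }

    private void ParseOptions(object sender,
        WebBrowserDocumentCompletedEventArgs e)
    {
        HtmlElement select = webBrowser1.Document.GetElementById("region");
        if (select == null)
        {
            return; // The page has no element with this ID.
        }

        // Collect every <option> nested under the <select>.
        foreach (HtmlElement option in select.GetElementsByTagName("option"))
        {
            regions[option.GetAttribute("value")] = option.InnerText;
        }
    }

    [STAThread]
    public static void Main()
    {
        Application.Run(new RegionForm());
    }
}
```

Using the indexer (`regions[key] = text`) instead of `Add` means a repeated value overwrites the earlier entry rather than throwing, which seemed the safer default for markup you do not control.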

Garett
  • I don't want to use a third-party library –  Oct 20 '10 at 14:29
  • @Tom: And yet you instantiate a WebBrowser instance? There is no standard framework way of parsing HTML, for good reason. You have no choice but to use a third party library, WebBrowser being Microsoft's implementation. – James Dunne Oct 20 '10 at 14:33

Html Agility Pack is a great HTML parser.

Pieter van Ginkel
  • Thanks for the advice. I used Html Agility Pack; it looks good and works fine. –  Oct 20 '10 at 20:57