Hello, I am reading an HttpWebResponse and getting back an HTML page with all the data I need, for example a table of dates. I need to read those dates into an array list and then save them to an XML file.

Example of the HTML page:

<table>
<tr>
<td class="padding5 sorting_1">
<span class="DateHover">01.03.14</span>
</td>
<td class="padding5 sorting_1">
<span class="DateHover" >10.03.14</span>
</td>
</tr>
</table>

My code, which is not working (I am using the HtmlAgilityPack):

    private static string GetDataByIClass(string HtmlIn, string ClassToGet)
    {
        HtmlAgilityPack.HtmlDocument DocToParse = new HtmlAgilityPack.HtmlDocument();
        DocToParse.LoadHtml(HtmlIn);
        HtmlAgilityPack.HtmlNode InputNode = DocToParse.GetElementbyId(ClassToGet); // Here is the problem: there is no DocToParse.GetElementbyClass method
        if (InputNode != null)
        {
            if (InputNode.Attributes["value"].Value != null)
            {
                return InputNode.Attributes["value"].Value;
            }
        }

        return null;
    }

So I need to read this data and extract the dates 01.03.14 and 10.03.14, so that I can save them to an array list (and then to an XML file).

Any ideas how I can get these dates (01.03.14 and 10.03.14)?

Vladimir Potapov
  • See http://stackoverflow.com/questions/846994/how-to-use-html-agility-pack, in particular the section "HtmlAgilityPack uses XPath syntax, and though many argues that it is poorly documented, I had no trouble using it with help from this XPath documentation: http://www.w3schools.com/xpath/xpath_syntax.asp" – ray Apr 13 '14 at 08:29
  • Check out QuerySelector – csharpwinphonexaml Apr 13 '14 at 08:34

1 Answer

Html Agility Pack supports XPath, so you can do something like this:

foreach (HtmlNode node in doc.DocumentNode.SelectNodes("//span[@class='" + ClassToGet + "']"))
{
    string value = node.InnerText;
    // etc...
}

This means: starting from the top of the document (first /), search recursively through all descendants (second /) for SPAN elements whose CLASS attribute equals the given value. Then, for each matching element, get the inner text.
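To tie this back to the question, here is a minimal sketch of the whole round trip: collect the span texts into a list, then wrap them in XML. The class name `DateScraper`, the method names, and the `Dates`/`Date` element names are made up for illustration; only `HtmlDocument`, `LoadHtml`, `SelectNodes` and `InnerText` come from Html Agility Pack, and `XElement` from `System.Xml.Linq`. One detail worth guarding against is that `SelectNodes` returns null, not an empty collection, when nothing matches.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;
using HtmlAgilityPack;

// Hypothetical helper class, not part of Html Agility Pack.
public static class DateScraper
{
    // Returns the trimmed inner text of every <span> whose class
    // attribute is exactly equal to className.
    public static List<string> GetTextByClass(string html, string className)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        HtmlNodeCollection nodes =
            doc.DocumentNode.SelectNodes("//span[@class='" + className + "']");

        // SelectNodes yields null when there is no match, so a foreach
        // over the raw result would throw on a page without such spans.
        if (nodes == null)
        {
            return new List<string>();
        }

        return nodes.Select(n => n.InnerText.Trim()).ToList();
    }

    // Builds <Dates><Date>...</Date>...</Dates> from the collected values.
    // The element names are an assumption; pick whatever your file needs.
    public static XElement ToXml(IEnumerable<string> dates)
    {
        return new XElement("Dates", dates.Select(d => new XElement("Date", d)));
    }
}
```

With the HTML from the question, `DateScraper.GetTextByClass(html, "DateHover")` should give a list holding `01.03.14` and `10.03.14`, and `DateScraper.ToXml(dates).Save("dates.xml")` writes it to disk (the file name is just an example).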

Simon Mourier
  • It won't work if the node has more than one class. For example, if we are searching for nodes with the "todo-task" class, it won't return nodes whose class attribute is set to "todo-task dark-theme border-green" – mhn2 Nov 11 '21 at 07:06
  • @mhn2 - Semantically, the Html Agility Pack has no notion of what the HTML class concept is; @class='something' just searches for a 'class' HTML attribute whose value is exactly 'something', whatever that means. – Simon Mourier Nov 11 '21 at 07:41
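
For the multi-class case raised in the comments, one possible workaround is to split the class attribute on whitespace and compare tokens instead of comparing the whole string. The sketch below does that with LINQ over `Descendants`; the class `ClassTokenMatcher` and the method `FindByClassToken` are hypothetical names, not Html Agility Pack API. A commonly used pure-XPath alternative is `//span[contains(concat(' ', normalize-space(@class), ' '), ' todo-task ')]`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

// Hypothetical helper class, not part of Html Agility Pack.
public static class ClassTokenMatcher
{
    private static readonly char[] Whitespace = { ' ', '\t', '\r', '\n' };

    // Returns every element named tagName whose class attribute contains
    // className as one of its whitespace-separated tokens.
    public static List<HtmlNode> FindByClassToken(
        string html, string tagName, string className)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        return doc.DocumentNode
            .Descendants(tagName)
            // GetAttributeValue falls back to "" when the attribute is absent.
            .Where(n => n.GetAttributeValue("class", "")
                         .Split(Whitespace, StringSplitOptions.RemoveEmptyEntries)
                         .Contains(className))
            .ToList();
    }
}
```

Under these assumptions, searching for `todo-task` would match an element with `class="todo-task dark-theme border-green"`, but not one with `class="todo-task-done"`, which a plain substring test would wrongly accept.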