I am trying to do some screen scraping with HtmlAgilityPack, using SelectNodes to get a set of nodes and then reading some values from each node returned.
Here is the code:
    private readonly HtmlDocument _document = new HtmlDocument();

    public void ParseValues(string html)
    {
        _document.LoadHtml(html);
        var tables = _document.DocumentNode.SelectNodes("//table");
        foreach (var table in tables)
        {
            _document.LoadHtml(table.OuterHtml);
            var value = _document.DocumentNode.SelectSingleNode("//tbody[1]/tr/td[0]");
        }
    }
However, I have noticed that when I try to select children inside the foreach loop, the search actually starts from the document root rather than from the current table, which is really annoying.
Questions:
Is there a way to select the values from each table returned from SelectNodes without having to create a new document instance from the HtmlDocument?
Is there a way to dispose of HtmlDocument? I noticed that there is a memory leak every time I use _document.LoadHtml(html).