I'm new to XPath. Given an arbitrary HTML document, I'd like to extract a list of XPath expressions for all of its nodes. For example:
html
html/head
html/head/title
html/body
html/body/div
html/body/div/p
...
This is an SSCCE to illustrate what I want:
using System;
using System.Xml;

class Program
{
    static void Main(string[] args)
    {
        // Sample document; it must be well-formed XML for LoadXml to accept it
        String html = @"
            <html>
            <head>
            <title>Test</title>
            </head>
            <body>
            <div>
            <p>Test2</p>
            </div>
            </body>
            </html>
            ";
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(html);
        // Start the recursive walk from each top-level node
        foreach (XmlNode node in doc.ChildNodes)
            ExamineNode(node);
    }

    static void ExamineNode(XmlNode node)
    {
        Console.WriteLine(/* WHAT TO PUT HERE */); // I want to show the path to this node
        // Recurse into every child, including text nodes
        foreach (XmlNode childNode in node.ChildNodes)
            ExamineNode(childNode);
    }
}
I don't know which property of XmlNode to use, if any, or how to compute the path. One method might be to take each node's name and build up a string while traversing the tree, but I thought there might be a better way. I'm looking for the best way to do this.
Similar questions have been asked here and here, but I'm looking for tips on how to implement this in C# in as simple a manner as possible.
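For reference, the string-building idea mentioned above might look roughly like the sketch below. It is only a minimal illustration, not a definitive answer: the class name PathSketch and the helpers CollectPaths and Walk are made up for this example, and the paths contain element names only, so repeated siblings (say, two p elements under the same div) produce identical strings.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

static class PathSketch
{
    // Hypothetical helper: returns a name-only path for every element under `node`.
    public static List<string> CollectPaths(XmlNode node)
    {
        var paths = new List<string>();
        Walk(node, "", paths);
        return paths;
    }

    // Carries the parent's path down the recursion and appends each element's name.
    static void Walk(XmlNode node, string parentPath, List<string> paths)
    {
        string path = parentPath;
        if (node.NodeType == XmlNodeType.Element)
        {
            path = parentPath.Length == 0 ? node.Name : parentPath + "/" + node.Name;
            paths.Add(path);
        }
        // Non-element nodes (the document node, text, comments) add nothing to the path.
        foreach (XmlNode child in node.ChildNodes)
            Walk(child, path, paths);
    }

    static void Main()
    {
        var doc = new XmlDocument();
        doc.LoadXml("<html><head><title>Test</title></head><body><div><p>Test2</p></div></body></html>");
        foreach (string p in CollectPaths(doc))
            Console.WriteLine(p);
    }
}
```

With the sample document this prints the six paths from html down to html/body/div/p, in document order. Telling repeated siblings apart would need a positional predicate such as p[2], which this sketch does not attempt.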