I couldn't find a built-in method that returns the kind of path you want, but a recursive function does the trick. Here is the code I came up with:
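    // Requires: using System; using System.Linq; using System.Xml;
    private void button1_Click(object sender, EventArgs e)
    {
        string xmlText = textBox1.Text;
        string exp = "//text()"; // XPath: every text node in the document
        XmlDocument xml = new XmlDocument();
        xml.LoadXml(xmlText);

        // Write one "(path, text)" line per text node to the output textbox
        foreach (XmlNode x in xml.SelectNodes(exp))
            textBox2.AppendText("(" + GetPath(x) + ", " + x.InnerText + ")" + Environment.NewLine);
    }

    string GetPath(XmlNode nd)
    {
        if (nd.ParentNode != null && nd.NodeType == XmlNodeType.Text)
        {
            // A text node has no useful name of its own; use the path of its containing element
            return GetPath(nd.ParentNode);
        }
        else if (nd.ParentNode != null && nd.NodeType != XmlNodeType.Text)
        {
            // Zero-based position of this node among all of its parent's child nodes
            var index = nd.ParentNode.ChildNodes.Cast<XmlNode>().ToList().IndexOf(nd);
            string path = GetPath(nd.ParentNode);
            path += (path != "") ? "/" : "";
            return string.Format("{0}{1}[{2}]", path, nd.Name, index);
        }
        else return ""; // Reached the document node (no parent): recursion ends here
    }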
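I was testing it on a Form, hence the button click event. Using //text() to get all text nodes was the easy part. Coming up with a recursive function to build the path was harder than I expected. The trick is that ParentNode.ChildNodes is an XmlNodeList, which has no IndexOf() method of its own; casting it to a collection of XmlNode and then converting that to a List lets us use the IndexOf() method of List to get the node's index. Note that this index is zero-based and counts all of the parent's child nodes, not just those with the same name, which is why the results below contain ul[1] and ul[2].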
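If you need a path that can be fed back into SelectSingleNode(), the same IndexOf() idea can be adapted to XPath's convention, which is one-based and counts only siblings with the same name. The following is a minimal sketch, not part of the code above: the XPathBuilder class and the GetXPath name are hypothetical, and it assumes the document does not use namespaces (prefixed names would need an XmlNamespaceManager).

```csharp
using System;
using System.Linq;
using System.Xml;

static class XPathBuilder
{
    // Hypothetical helper: builds an absolute XPath for the element that
    // contains 'node', using 1-based positions among same-name siblings.
    public static string GetXPath(XmlNode node)
    {
        // The document node (or a missing parent) ends the recursion.
        if (node == null || node.NodeType == XmlNodeType.Document)
            return "";

        // Text and other non-element nodes take the path of their parent.
        if (node.NodeType != XmlNodeType.Element)
            return GetXPath(node.ParentNode);

        // Keep only element siblings with the same name, then locate this
        // node among them; +1 because XPath positions start at 1.
        int position = node.ParentNode.ChildNodes
            .Cast<XmlNode>()
            .Where(n => n.NodeType == XmlNodeType.Element && n.Name == node.Name)
            .ToList()
            .IndexOf(node) + 1;

        return GetXPath(node.ParentNode) + "/" + node.Name + "[" + position + "]";
    }
}
```

With the input <div><p>Title</p><ul><li>A</li><li>B</li></ul><p>Bill</p></div>, the text node "Bill" would map to /div[1]/p[2], and passing that string to SelectSingleNode() on the same document returns the second p element.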
Results:
(div[0]/p[0], Title)
(div[0]/ul[1]/li[0], Features)
(div[0]/ul[2]/li[0], Name)
(div[0]/ul[2]/li[1], Age)
(div[0]/ul[2]/li[2], Gender)
(div[0]/h2[3], Comments)
(div[0]/p[4], Bill)
(div[0]/p[5], Link)
One caveat: I don't know what application you will be using this for, but if you plan to use it to get elements from HTML, the LoadXml() call may fail. "Valid" HTML is not necessarily well-formed XML, and in that case the load throws an exception.
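One way to guard against that is to wrap the load and catch the XmlException that LoadXml() throws on malformed input. This is only a sketch: the SafeXml class and the TryLoadXml name are hypothetical, and how you report the error is up to your application.

```csharp
using System;
using System.Xml;

static class SafeXml
{
    // Hypothetical helper: returns true and the parsed document when 'text'
    // is well-formed XML; otherwise returns false and a description of the error.
    public static bool TryLoadXml(string text, out XmlDocument doc, out string error)
    {
        doc = new XmlDocument();
        try
        {
            doc.LoadXml(text);
            error = null;
            return true;
        }
        catch (XmlException ex)
        {
            // LineNumber and LinePosition point at where parsing stopped.
            doc = null;
            error = string.Format("Line {0}, position {1}: {2}",
                                  ex.LineNumber, ex.LinePosition, ex.Message);
            return false;
        }
    }
}
```

For example, <p>Line one<br>Line two</p> is fine as HTML but is rejected here because the br element is never closed; the same goes for entities such as &nbsp; that XML does not predefine.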