
I have a variable XML that might look something like this:

<content>
    <main editable="true">
        <h1>Here is my header</h1>
        <p>Here is my content</p>
    </main>
    <buttons>
        <positive editable="true">I agree!</positive>
        <negative editable="true">No - get me outta here!</negative>
    </buttons>
</content>

I'd like to get the XPath for all of the nodes that have the attribute "editable" that equals "true". Please note that the attributes can be at variable node levels so I can't just loop through all the nodes at one level and check for the attribute. I'd also like to use XmlReader because of the speed but if there's a better/faster way, then I'm open to that as well.

var xml = System.IO.File.ReadAllText(contentFilePath);
var readXML = XmlReader.Create(new StringReader(xml));

readXML.ReadToFollowing("content");

while (readXML.Read()) {
    //???
}
RichC
  • If you are reading all the text to a string, I think you lose the main benefit of XmlReader... – Jacob Feb 10 '16 at 21:37
  • @Jacob Ah well, that's good to know. What's the fastest way to do it if the file is on the file system? Read the bytes and provide the stream instead? – RichC Feb 10 '16 at 21:40
  • @RichC How large are your files? The performance benefit is arguably not worth it unless they are *very* large. – Mathias R. Jessen Feb 10 '16 at 21:45
  • `XmlReader` doesn't make the parent stack publicly available so you'd have to create your own pushdown stack as the reader goes into and out of elements. For an alternative, read into an `XElement` then do https://stackoverflow.com/questions/451950/get-the-xpath-to-an-xelement – dbc Feb 10 '16 at 21:47
  • Also, are you looking to construct an absolute XPath expression string *from* the element, or are you trying to figure out how to select all nodes with the editable attribute set to true *using* XPath? (sorry if this is just my poor english interpretation) – Mathias R. Jessen Feb 10 '16 at 21:49
  • I basically have a "master" XML that tells me which nodes are editable. I need a collection of XPaths from this file. Then I need to open a different XML file (which is provided by the user) and pull out and display the inner contents of each XPath node from that file. I need the FULL XPath so I can easily query the values in the second XML file. – RichC Feb 10 '16 at 21:58
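
The pushdown-stack idea mentioned in the comments could look roughly like the sketch below. It is only an illustration: `EditablePathReader`, `FindEditablePaths` and `Frame` are made-up names, and it assumes the paths should use the `/name[index]` form, where the index counts same-named siblings. Unlike a walk that stops at the first editable node, it also reports editable elements nested inside other editable elements.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class EditablePathReader
{
    // Hypothetical helper: streams the XML once and returns an indexed XPath
    // for every element whose "editable" attribute equals "true".
    public static List<string> FindEditablePaths(TextReader input)
    {
        var paths = new List<string>();

        // One frame per open element: its path plus per-name counters for its children.
        var stack = new Stack<Frame>();
        stack.Push(new Frame("")); // document level, which has no path of its own

        using (var reader = XmlReader.Create(input))
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element)
                {
                    // Position of this element among siblings with the same name.
                    var parent = stack.Peek();
                    int count;
                    parent.ChildCounts.TryGetValue(reader.Name, out count);
                    count++;
                    parent.ChildCounts[reader.Name] = count;

                    var path = parent.Path + "/" + reader.Name + "[" + count + "]";
                    if (reader.GetAttribute("editable") == "true")
                    {
                        paths.Add(path);
                    }

                    // Empty elements (<a/>) never produce an EndElement, so they are not pushed.
                    if (!reader.IsEmptyElement)
                    {
                        stack.Push(new Frame(path));
                    }
                }
                else if (reader.NodeType == XmlNodeType.EndElement)
                {
                    stack.Pop();
                }
            }
        }
        return paths;
    }

    private sealed class Frame
    {
        public readonly string Path;
        public readonly Dictionary<string, int> ChildCounts = new Dictionary<string, int>();

        public Frame(string path)
        {
            Path = path;
        }
    }
}
```

Passing a `StreamReader` over the file (rather than a `StringReader` over the whole text) should keep the streaming benefit of `XmlReader`, although for small files the difference is unlikely to matter.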

1 Answer


Thanks to everyone's feedback, I went with this code for my solution:

var xml = System.IO.File.ReadAllText(masterLangDir);
var xdoc = new XmlDocument();
xdoc.LoadXml(xml);
var xPaths = findAllNodes(xdoc.SelectSingleNode("content"), new List<string>());

public List<string> findAllNodes(XmlNode node, List<string> xPaths)
{
    foreach (XmlNode n in node.ChildNodes) {
        var checkForChildNodes = true;
        if (n.Attributes != null) {
            if (n.Attributes["editable"] != null) {
                if (n.Attributes["editable"].Value == "true") {
                    xPaths.Add(GetXPathToNode(n));
                    checkForChildNodes = false;
                }
            }
        }
        if (checkForChildNodes) {
            xPaths = findAllNodes(n, xPaths);
        }
    }
    return xPaths;
}

public string GetXPathToNode(XmlNode node)
{
    if (node.NodeType == XmlNodeType.Attribute) {
        // attributes have an OwnerElement, not a ParentNode; also they have             
        // to be matched by name, not found by position             
        return String.Format("{0}/@{1}", GetXPathToNode(((XmlAttribute)node).OwnerElement), node.Name);
    }
    if (node.ParentNode == null) {
        // the only node with no parent is the root node, which has no path
        return "";
    }

    // Get the Index
    int indexInParent = 1;
    XmlNode siblingNode = node.PreviousSibling;
    // Loop thru all Siblings
    while (siblingNode != null) {
        // Increase the Index if the Sibling has the same Name
        if (siblingNode.Name == node.Name) {
            indexInParent += 1;
        }
        siblingNode = siblingNode.PreviousSibling;
    }

    // the path to a node is the path to its parent, plus "/name[n]", where n is its position among siblings with the same name.
    return String.Format("{0}/{1}[{2}]", GetXPathToNode(node.ParentNode), node.Name, indexInParent);
}

I picked up the GetXPathToNode function from this thread.
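
For the second step (pulling the matching content out of the user-supplied file), something along these lines should work. This is a minimal sketch, not tested against real data: `EditableContent` and `ReadValues` are names invented for the example, and it assumes neither file uses XML namespaces, since unprefixed names in an XPath would not match namespaced elements.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class EditableContent
{
    // Hypothetical helper: looks up each collected XPath in the user-supplied
    // document and returns the inner XML of the nodes that exist there.
    public static Dictionary<string, string> ReadValues(XmlDocument userDoc, IEnumerable<string> xPaths)
    {
        var values = new Dictionary<string, string>();
        foreach (var xPath in xPaths)
        {
            XmlNode node = userDoc.SelectSingleNode(xPath);

            // SelectSingleNode returns null when the user's file has no such node.
            if (node != null)
            {
                values[xPath] = node.InnerXml;
            }
        }
        return values;
    }
}
```

After loading the user's file with `XmlDocument.Load`, calling `EditableContent.ReadValues(userDoc, xPaths)` with the list returned by `findAllNodes` gives one entry per editable node that is present in that file.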

RichC