12

I need to create an XPath expression that does the following:

  • Returns the element inside of 'NodeA' by default
  • Returns the element inside of 'NodeB' if it is not empty.

Here is some sample XML so that my target structure can be clearly seen (I am using MS InfoPath):

<?xml version="1.0" encoding="UTF-8"?><?mso-infoPathSolution solutionVersion="1.0.0.10" productVersion="14.0.0" PIVersion="1.0.0.0" href="file:///C:\Documents%20and%20Settings\Chris\Local%20Settings\Application%20Data\Microsoft\InfoPath\Designer3\9016384cab6148f6\manifest.xsf" ?><?mso-application progid="InfoPath.Document" versionProgid="InfoPath.Document.3"?>
<my:myFields xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xhtml="http://www.w3.org/1999/xhtml" xmlns:my="http://schemas.microsoft.com/office/infopath/2003/myXSD/2012-09-07T14:19:10" xmlns:xd="http://schemas.microsoft.com/office/infopath/2003" xml:lang="en-us">
    <my:NodeASection>
        <my:NodeA>2012-09-13</my:NodeA>
    </my:NodeASection>
    <my:NodeBSection>
        <my:NodeBGroup>
            <my:NodeB>2012-09-14</my:NodeB>
        </my:NodeBGroup>
    </my:NodeBSection>
</my:myFields>

This XPath expression can be used to evaluate NodeB for the existence of text: boolean(//my:NodeB[(text())])

I have heard of the "Becker Method" but I'm not sure how that applies when both nodes exist. I'm very new to XPath and appreciate any help that can be offered.

Shrout1

3 Answers

17

This XPath expression returns NodeB if it exists and has text content, and NodeA otherwise:

//my:NodeB[text()] | //my:NodeA[text() and not(//my:NodeB[text()])]

If you want to get all child elements instead, you can append /* after the selected node, like this:

//my:NodeB[text()]/* | //my:NodeA[text() and not(//my:NodeB[text()])]/*
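To make the fallback concrete, here is a minimal Python sketch of the same selection rule. It does not evaluate the XPath above: the standard library's ElementTree only supports a subset of XPath (no text() predicates, unions, or not()), so the "NodeB if it has text, otherwise NodeA" logic is written out by hand. The helper names make_doc and pick_node are hypothetical, the document is a trimmed copy of the sample, and the sketch assumes at most one NodeA and one NodeB, both leaf elements.

```python
import xml.etree.ElementTree as ET

# Namespace URI copied from the sample document in the question.
MY_NS = "http://schemas.microsoft.com/office/infopath/2003/myXSD/2012-09-07T14:19:10"
NS = {"my": MY_NS}


def make_doc(a_text, b_text):
    """Build a trimmed copy of the sample form (hypothetical helper)."""
    return ET.fromstring(
        f'<my:myFields xmlns:my="{MY_NS}">'
        f"<my:NodeASection><my:NodeA>{a_text}</my:NodeA></my:NodeASection>"
        f"<my:NodeBSection><my:NodeBGroup><my:NodeB>{b_text}</my:NodeB>"
        f"</my:NodeBGroup></my:NodeBSection></my:myFields>"
    )


def pick_node(root):
    """Return NodeB when it has text, otherwise NodeA, otherwise None."""
    node_b = root.find(".//my:NodeB", NS)
    # For a leaf element, a non-empty .text plays the role of [text()].
    if node_b is not None and node_b.text:
        return node_b
    node_a = root.find(".//my:NodeA", NS)
    if node_a is not None and node_a.text:
        return node_a
    return None


both_filled = pick_node(make_doc("2012-09-13", "2012-09-14"))
b_empty = pick_node(make_doc("2012-09-13", ""))
print(both_filled.text)  # 2012-09-14
print(b_empty.text)      # 2012-09-13
```

With both fields filled in, NodeB wins; once NodeB is emptied, the result falls back to NodeA.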
David Pärsson
  • You are the BOMB! I spent all day yesterday even trying to formulate that question. Man I need to take a class for this stuff... I just adapted it to my slightly more complex issue. Thanks again! – Shrout1 Sep 11 '12 at 13:19
4

A correct XPath expression is:

(//my:NodeB[node()] | //my:NodeA[not(//my:NodeB/node())])/node()

As the conditions in the predicates are mutually exclusive, only one of them can be true(), and this guarantees that only one of the two nodes is selected by the expression within the parentheses.

So, the expression above selects every node that is a child of my:NodeB if it has children, or of my:NodeA otherwise.

Here we assume as given that at most one element named my:NodeA and at most one element named my:NodeB exist in the XML document.

Another assumption is that the namespace to which the prefix my is bound has been "registered" with the XPath expression evaluator (the specific XPath implementation you are using).

Do note that in the provided XML document neither of the elements my:NodeA and my:NodeB has any element children (they both have just a text node child) -- so I assume that by "element" you actually mean "node".

Dimitre Novatchev
  • Dimitre - thanks again! Your response does seem to be more broadly applicable. I'm new to all of this so forgive me if I am using terminology incorrectly; just trying to pick it up as I go! I don't know exactly how MS InfoPath evaluates XPath with the `my` namespace. David Pärsson's answer worked well though! I believe that the [node()] check might be the more general one. – Shrout1 Sep 12 '12 at 12:56
  • @Chris, it all depends on what you mean by "empty", and there isn't a universal definition for this. The currently accepted answer most likely interprets "empty" as not having a text-node child. My answer interprets "empty" as not having any child (so an element that has other elements as children, but no text-node child, is considered non-empty). Other people would require at least one text-node child that isn't composed only of whitespace characters. If you state your definition of "empty" precisely, then people can give a more precise answer. – Dimitre Novatchev Sep 12 '12 at 13:09
  • In this instance it is checking for missing text. InfoPath fields must be bound to an XML element (or node?) and so there will always be a pre-existing structure. On many forms I have repeating tables or sections and in that instance one could check for "existence" of the node itself. – Shrout1 Sep 12 '12 at 13:39
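The three readings of "empty" discussed in these comments can be compared side by side. The following is only a rough Python sketch using the standard library's ElementTree, not an XPath evaluation; the function names and the sample element `n` are hypothetical. One known gap: ElementTree's default parser drops comments and processing instructions, which an XPath node() test would count as children.

```python
import xml.etree.ElementTree as ET


def has_text_node(elem):
    """Roughly elem[text()]: at least one text-node child, even whitespace-only."""
    return bool(elem.text) or any(child.tail for child in elem)


def has_any_child(elem):
    """Roughly elem[node()]: any text or element child.

    Comments and processing instructions are not seen here, because the
    default ElementTree parser discards them.
    """
    return has_text_node(elem) or len(elem) > 0


def has_nonblank_text(elem):
    """Roughly elem[text()[normalize-space()]]: text that is not only whitespace."""
    texts = [elem.text] + [child.tail for child in elem]
    return any(t and t.strip() for t in texts)


def classify(xml):
    """Return the three checks as a tuple (hypothetical helper)."""
    elem = ET.fromstring(xml)
    return has_text_node(elem), has_any_child(elem), has_nonblank_text(elem)


for xml in ["<n/>", "<n>  </n>", "<n><c/></n>", "<n>2012-09-14</n>"]:
    print(xml, classify(xml))
```

A whitespace-only element passes the first two checks but not the third, and an element holding only child elements passes only the second, which is where the two answers differ.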
1

If it is safe to rely on NodeA always coming before NodeB in document order (as implied by your sample), then a simpler, and probably cheaper to evaluate, XPath expression to select the required element is:

(//my:NodeA | //my:NodeB[text()])[last()]

A union is returned in document order, so [last()] picks NodeB when it has text and falls back to NodeA otherwise. The above selects the element. If you want to select the text node of the element, then use instead:

(//my:NodeA | //my:NodeB[text()])[last()]/text()

If there is no positional relationship between NodeA and NodeB (they can come in any relative order), and you are using XPath 2.0, then the following expression will select the required text node:

(//my:NodeB[text()], //my:NodeA)[1]/text()
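The reason position matters for the union form but not for the sequence form is ordering: an XPath 1.0 union comes back in document order no matter how its operands are written, whereas an XPath 2.0 sequence keeps the order in which the operands appear. A small Python sketch can emulate that difference. It uses the standard library's ElementTree rather than a real XPath engine, and union_first and sequence_first are hypothetical helpers operating on a trimmed copy of the sample.

```python
import xml.etree.ElementTree as ET

# Namespace URI copied from the sample document in the question.
MY_NS = "http://schemas.microsoft.com/office/infopath/2003/myXSD/2012-09-07T14:19:10"
NS = {"my": MY_NS}

# Trimmed sample: NodeA appears before NodeB in document order.
root = ET.fromstring(
    f'<my:myFields xmlns:my="{MY_NS}">'
    "<my:NodeASection><my:NodeA>2012-09-13</my:NodeA></my:NodeASection>"
    "<my:NodeBSection><my:NodeBGroup><my:NodeB>2012-09-14</my:NodeB>"
    "</my:NodeBGroup></my:NodeBSection></my:myFields>"
)


def union_first(root, *paths):
    """Emulate (p1 | p2)[1]: merge all matches, take the earliest in document order."""
    position = {id(elem): i for i, elem in enumerate(root.iter())}
    matches = [elem for path in paths for elem in root.findall(path, NS)]
    return min(matches, key=lambda elem: position[id(elem)], default=None)


def sequence_first(root, *paths):
    """Emulate XPath 2.0 (p1, p2)[1]: first match, trying paths in written order."""
    for path in paths:
        found = root.find(path, NS)
        if found is not None:
            return found
    return None


# Same operands, written with NodeB first in both cases.
print(union_first(root, ".//my:NodeB", ".//my:NodeA").text)     # 2012-09-13
print(sequence_first(root, ".//my:NodeB", ".//my:NodeA").text)  # 2012-09-14
```

Writing NodeB first changes the result only for the sequence form; the union still yields whichever element comes first in the document.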
Sean B. Durkin