
I have the following XML sample, for which I need an XPath query that returns only node1 and node3.

<root>
    <node1 />
    <node2 anyAttribute="anyText" />
    <node3> </node3>
    <node4>anyText</node4>
    <node5>
        <anyChildNode />
    </node5>
</root>

In other words, an XPath query that returns all nodes that simultaneously have:

  1. no attributes
  2. no child nodes
  3. no or whitespace-only content

I've found some solutions (1 & 2), but each of them addresses only one of the points above:

  • for 1. /root/node()[not(@*)] - tested and works
  • for 2. /root/node()[not(node())] - tested and works
  • for 3. /root/node()[string-length(normalize-space(text())) = 0] - not working (I don't know why)
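A likely explanation for the third attempt, offered as a sketch rather than a definitive diagnosis: in XPath 1.0, node() matches text nodes as well as elements, so /root/node() also visits the whitespace between the elements. For such a text node, text() is an empty node-set, normalize-space() of it is the empty string, and the predicate holds. The same happens for node2 (no text at all) and node5 (its first text child is whitespace). The Python snippet below does not evaluate XPath. It only uses the standard library's xml.dom.minidom to list the child nodes of root, and the helper name child_kinds is made up for illustration.

```python
from xml.dom import minidom

SAMPLE = """<root>
    <node1 />
    <node2 anyAttribute="anyText" />
    <node3> </node3>
    <node4>anyText</node4>
    <node5>
        <anyChildNode />
    </node5>
</root>"""


def child_kinds(xml_text):
    """List (kind, value) for each element or text child of the document element.

    Hypothetical helper for illustration: 'element' entries carry the tag
    name, 'text' entries carry the raw character data.
    """
    root = minidom.parseString(xml_text).documentElement
    kinds = []
    for child in root.childNodes:
        if child.nodeType == child.ELEMENT_NODE:
            kinds.append(("element", child.tagName))
        elif child.nodeType == child.TEXT_NODE:
            kinds.append(("text", child.data))
    return kinds


if __name__ == "__main__":
    # Whitespace-only text nodes show up alongside the five elements.
    for kind, value in child_kinds(SAMPLE):
        print(kind, repr(value))
```

Replacing the node() step with * restricts the selection to elements, so the whitespace text nodes are no longer candidates.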

Yes, I know I could combine the three variants above, but I would like to avoid that. I would think there should be an easier way to search for empty nodes/elements, shouldn't there?


I'm also limited to XPath 1.0 on .NET, since there has been no progress on supporting newer versions.

Teodor Tite

1 Answer


This XPath,

/root/*[not(@* or * or text()[normalize-space()])]

will select only node1 and node3, as requested.

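Explanation: This selects every element child of root (* matches only elements, whereas node() also matches text, comment, and processing-instruction nodes) that has no attributes (@*), no child elements (*), and no text children containing non-whitespace characters (text()[normalize-space()]).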
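As a cross-check of that predicate, here is a minimal Python sketch that applies the same three tests by hand. It is not an XPath evaluation: the standard library's xml.etree.ElementTree does not support not() or "or" in its limited XPath subset, so the conditions are spelled out in plain Python. The names XPATH_WS, is_empty and select_empty are assumptions introduced for this example.

```python
import xml.etree.ElementTree as ET

# Whitespace characters that XPath 1.0 normalize-space() strips.
XPATH_WS = " \t\r\n"

SAMPLE = """<root>
    <node1 />
    <node2 anyAttribute="anyText" />
    <node3> </node3>
    <node4>anyText</node4>
    <node5>
        <anyChildNode />
    </node5>
</root>"""


def is_empty(elem):
    """Mirror not(@* or * or text()[normalize-space()]) for one element."""
    has_attributes = bool(elem.attrib)        # @*
    has_child_elements = len(elem) > 0        # *
    # Text children: the leading text plus the tail after each child element.
    chunks = [elem.text] + [child.tail for child in elem]
    has_text = any(chunk and chunk.strip(XPATH_WS) for chunk in chunks)
    return not (has_attributes or has_child_elements or has_text)


def select_empty(xml_text):
    """Return the tag names of root's element children that pass is_empty."""
    root = ET.fromstring(xml_text)
    return [child.tag for child in root if is_empty(child)]


if __name__ == "__main__":
    print(select_empty(SAMPLE))  # ['node1', 'node3']
```

On .NET the XPath expression itself is the thing to use. The sketch only makes the three conditions explicit so they can be compared against the question's list.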

kjhughes
  • +1, works fine. I was thinking that using `node()` would be faster than `*`, but it seems I confused it with `//`. – Teodor Tite Sep 10 '19 at 11:21