
How to find only nodes with at least one similar/equal sibling node using XPath?

For example:

<root>
  <parent>
    <node>...</node>
    <node_unique>...</node_unique>
    <node>...</node>
    <another_one>...</another_one>
    <another_one>...</another_one>
  </parent>
</root>

In the example, the XPath should select only <node> and <another_one>, because they appear more than once.

I have been trying to find a solution for hours without success (now I think it is not possible with XPath...).

Dimitre Novatchev
Enrique

1 Answer


These are impossible to select with a single XPath 1.0 expression (due to lack of range variables in XPath 1.0).

One possible solution is to select all /*/*/* elements, get the name of each element by evaluating name() on it, and then evaluate /*/*/*[name() = $currentName][2] (where $currentName is substituted with the name just obtained). If this last expression selects an element, then the name occurs at least twice, so you keep that element. Do this for all elements and their names. As an auxiliary step, one might deduplicate the names (and the selected elements) by placing them in a hash table.
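The procedure above could be sketched in a host language roughly as follows. This is only an illustration in Python using the standard library's xml.etree.ElementTree, which supports a limited XPath subset: it has no name() function, so the sketch substitutes each element name directly into the path (./*/NAME[2], evaluated relative to the top element) instead of using /*/*/*[name() = $currentName][2]. The function name repeated_child_names is hypothetical, and the sketch assumes element names without namespaces.

```python
import xml.etree.ElementTree as ET

XML = """<root>
  <parent>
    <node>...</node>
    <node_unique>...</node_unique>
    <node>...</node>
    <another_one>...</another_one>
    <another_one>...</another_one>
  </parent>
</root>"""


def repeated_child_names(xml_text):
    """Return names of grandchildren of the top element that occur at least twice."""
    root = ET.fromstring(xml_text)  # the top element, i.e. /*
    seen = {}  # hash table: name -> True if the name occurs at least twice
    for elem in root.findall("./*/*"):  # all /*/*/* elements
        name = elem.tag
        if name in seen:
            continue  # dedup: each name is checked only once
        # Stand-in for /*/*/*[name() = $currentName][2]
        seen[name] = bool(root.findall(f"./*/{name}[2]"))
    # dicts keep insertion order, so names come back in document order
    return [name for name, repeated in seen.items() if repeated]


print(repeated_child_names(XML))
```

To keep the elements rather than just the names, one could filter root.findall("./*/*") by whether each element's tag is in the returned list.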

In XPath 2.0 it is trivial to select, with a single XPath expression, all children of a given parent that have at least one other sibling with the same name:

/*/*/*
   [name() = following-sibling::*/name()
  and
    not(name() = preceding-sibling::*/name())
   ]

A much more compact expression:

/*/*/*[index-of(/*/*/*/name(), name())[2]]
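Why the compact expression works may not be obvious: index-of(...) returns the 1-based positions at which the current name occurs, [2] keeps the second of those positions (or nothing), and a predicate whose value is a single number is true only for the element whose context position equals that number. The plain-Python sketch below emulates this logic on a list of names. It is an illustration, not an XPath engine; the function name select_by_second_index is hypothetical, and it assumes a single parent, so that positions in /*/*/* coincide with positions among siblings.

```python
def select_by_second_index(names):
    """Emulate /*/*/*[index-of(/*/*/*/name(), name())[2]] on a list of names."""
    selected = []
    for position, name in enumerate(names, start=1):
        # index-of(names, name): every 1-based position where this name occurs
        occurrences = [i for i, n in enumerate(names, start=1) if n == name]
        # [2]: the second such position, or nothing if the name occurs once
        second = occurrences[1] if len(occurrences) >= 2 else None
        # a numeric predicate is true only when it equals the context position
        if second == position:
            selected.append((position, name))
    return selected


names = ["node", "node_unique", "node", "another_one", "another_one"]
print(select_by_second_index(names))  # [(3, 'node'), (5, 'another_one')]
```

Under these assumptions the expression picks the second occurrence of each repeated name, which yields one element per repeated name.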

XSLT 2.0 - based verification:

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>

 <xsl:template match="/">
  <xsl:copy-of select=
  "/*/*/*[index-of(/*/*/*/name(), name())[2]]"/>
 </xsl:template>
</xsl:stylesheet>

When this transformation is applied to the provided XML document:

<root>
  <parent>
    <node>...</node>
    <node_unique>...</node_unique>
    <node>...</node>
    <another_one>...</another_one>
    <another_one>...</another_one>
  </parent>
</root>

the above XPath expression is evaluated and the elements it selects are copied to the output:

<node>...</node>
<another_one>...</another_one>

Note: For a related question/answer, see this.

Dimitre Novatchev
  • +1, Very expressive and neat answer. But, Sir I thought there might be some other way. – Cylian Sep 23 '12 at 04:48
  • @Cylian, There always is "some other way" -- we must try to provide the most succinct one, and also a solution that is efficient. – Dimitre Novatchev Sep 23 '12 at 04:57
  • I'm really a bit confused about why you use `/*/*/*` and not just `//*`. When I tried `//*`, to my astonishment no nodes were selected. Why is that? Does `/*/*/*` mean anything special? – Cylian Sep 23 '12 at 05:03
  • @Cylian, `/*/*/*` selects all grand-children of the top element. – Dimitre Novatchev Sep 23 '12 at 05:05