How would you find all nodes between two H3's using XPATH?
5 Answers
In XPath 1.0 one way to do this is by using the Kayessian method for node-set intersection:
$ns1[count(.|$ns2) = count($ns2)]
The above expression selects exactly the nodes that are part of both the node-set $ns1 and the node-set $ns2.
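To see why the count test works, here is a minimal Python sketch of the same identity using plain sets. The node names are hypothetical stand-ins for real nodes, and `kayessian_intersection` is a name made up for this illustration, not part of any library.

```python
def kayessian_intersection(ns1, ns2):
    """Keep each node of ns1 whose union with ns2 is no larger than ns2."""
    ns2 = set(ns2)
    # Mirrors $ns1[count(.|$ns2) = count($ns2)]: adding a node that is
    # already in ns2 leaves the size of the union unchanged.
    return [node for node in ns1 if len({node} | ns2) == len(ns2)]


# Hypothetical node identities, for illustration only.
ns1 = ["a31", "b31", "h3#3", "a32", "b32"]
ns2 = ["a32", "b32", "h3#1"]
result = kayessian_intersection(ns1, ns2)
print(result)
```

A node from ns1 that is not in ns2 makes the union one element larger, so the predicate fails for it; that is all the XPath predicate checks.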
To apply this to the specific question -- let's say we need to select all nodes between the 2nd and 3rd h3 element in the following XML document:
<html>
<h3>Title T31</h3>
<a31/>
<b31/>
<h3>Title T32</h3>
<a32/>
<b32/>
<h3>Title T33</h3>
<a33/>
<b33/>
<h3>Title T34</h3>
<a34/>
<b34/>
<h3>Title T35</h3>
</html>
We have to substitute $ns1 with:
/*/h3[2]/following-sibling::node()
and to substitute $ns2 with:
/*/h3[3]/preceding-sibling::node()
Thus, the complete XPath expression is:
/*/h3[2]/following-sibling::node()
[count(.|/*/h3[3]/preceding-sibling::node())
=
count(/*/h3[3]/preceding-sibling::node())
]
We can verify that this is the correct XPath expression:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:template match="/">
<xsl:copy-of select=
"/*/h3[2]/following-sibling::node()
[count(.|/*/h3[3]/preceding-sibling::node())
=
count(/*/h3[3]/preceding-sibling::node())
]
"/>
</xsl:template>
</xsl:stylesheet>
When this transformation is applied on the XML document presented above, the wanted, correct result is produced:
<a32/>
<b32/>
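If no XSLT processor is at hand, the expected result can be cross-checked with a rough Python sketch. This does not evaluate the XPath expression; it reimplements the selection with the standard library's ElementTree, which has no following-sibling axis and exposes only element children, so the whitespace text nodes that node() would also match are ignored. The helper name `elements_between` is an assumption for this sketch.

```python
import xml.etree.ElementTree as ET

DOC = """<html>
<h3>Title T31</h3>
<a31/>
<b31/>
<h3>Title T32</h3>
<a32/>
<b32/>
<h3>Title T33</h3>
<a33/>
<b33/>
<h3>Title T34</h3>
<a34/>
<b34/>
<h3>Title T35</h3>
</html>"""


def elements_between(root, tag, start, end):
    """Child elements strictly between the start-th and end-th <tag> (1-based)."""
    children = list(root)
    # Positions of the marker elements among the children.
    marks = [i for i, child in enumerate(children) if child.tag == tag]
    return children[marks[start - 1] + 1 : marks[end - 1]]


root = ET.fromstring(DOC)
between = [el.tag for el in elements_between(root, "h3", 2, 3)]
print(between)
```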
II. XPath 2.0 solution:
Use the intersect operator:
/*/h3[2]/following-sibling::node()
intersect
/*/h3[3]/preceding-sibling::node()

- So the one use case this doesn't cover is content after the last H3. I'm curious what modification would be needed to pick this up. – klumsy Oct 02 '10 at 00:28
- @klumsy: Just prepend the existing expression with `"/*/h3[2]/following-sibling::node()[not(/*/h3[3])] | ` – Dimitre Novatchev Oct 02 '10 at 01:39
- How do I loop with this expression over more "slices"? How can I replace 2 and 3 with variables in a loop over all h3's? – zypro Feb 19 '18 at 13:43
- @zypro, It's simple: Have variables `$startInd`, `$endInd`, and declare them within your "loop" with the necessary values. Also, replace in the expression "`2`" with `$startInd` and "`3`" with `$endInd`. The XPath 2.0 expression can be even this: `for $i in 1 to count(/*/h3) -1 return /*/h3[$i]/following-sibling::node() intersect /*/h3[$i+1]/preceding-sibling::node()` – Dimitre Novatchev Feb 19 '18 at 15:58
Another XPath 1.0 solution, for when you know both marks are the same element (in this case h3):
/html/body/h3[2]/following-sibling::node()
[not(self::h3)]
[count(preceding-sibling::h3)=2]
A more general solution, in XPath 2.0, assuming you want nodes at all tree depths between the two h3 elements, which would not necessarily be siblings:
/path/to/first/h3/following::node()[. << /path/to/second/h3]
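The document-order idea behind this expression can be sketched in Python with the standard library. This is only an approximation under stated assumptions: the sample document is hypothetical, ElementTree sees elements only (no text nodes), and `elements_between_in_document_order` is a name invented here. Like the following:: axis, it skips the descendants of the first h3.

```python
import xml.etree.ElementTree as ET

# Hypothetical document in which the two h3 elements are not siblings.
DOC = (
    "<html><div><h3>One</h3><p>x<b>y</b></p></div>"
    "<div><i/><h3>Two</h3><u/></div></html>"
)


def elements_between_in_document_order(root, first, second):
    """Elements after `first` (excluding its descendants) and before `second`."""
    order = list(root.iter())         # iter() yields elements in document order
    inside_first = set(first.iter())  # first itself plus its descendants
    lo, hi = order.index(first), order.index(second)
    return [el for el in order[lo + 1 : hi] if el not in inside_first]


root = ET.fromstring(DOC)
first, second = root.findall(".//h3")
tags = [el.tag for el in elements_between_in_document_order(root, first, second)]
print(tags)
```

Note that the second div is part of the result: it starts after the first h3 and before the second one, which is also how following::node()[. << ...] treats it.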

Based on Dimitre Novatchev's excellent answer, I came up with the following solution: rather than hardcoding [2] and [3] for the different H3s, I just give the text of the first heading.
//h3[text()="Main Page Section Heading"]/following-sibling::node()
[ count(.|//h3[text()="Main Page Section Heading"]/following-sibling::h3[1]/preceding-sibling::node()) =
count(//h3[text()="Main Page Section Heading"]/following-sibling::h3[1]/preceding-sibling::node()) ]
Where I'd want to go further, though, is the case where I'm looking at the last H3 and want everything after it; with the expression above I can't get what follows the last H3.
There's another great generic solution using keys, assuming that your <h3> tags have a unique property (e.g. their text or an id attribute):
<xsl:key name="siblings_of_h3" match="*[not(self::h3)]" use="preceding-sibling::h3[1]/text()"/>
<xsl:template match="h3">
<!-- now select all tags belonging to the current h3 -->
<xsl:apply-templates select="key('siblings_of_h3', text())"/>
</xsl:template>
It groups all tags by their preceding <h3>.
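The grouping that the key performs can be illustrated with a small Python sketch. It is not XSLT and not a drop-in replacement; the sample document and the function name `group_by_preceding_heading` are assumptions, and, like the key, it relies on each heading's text being unique. Because every element is filed under its nearest preceding h3, content after the last h3 is picked up as well.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; heading texts are assumed to be unique.
DOC = "<html><h3>A</h3><a1/><b1/><h3>B</h3><a2/><h3>C</h3><a3/><b3/></html>"


def group_by_preceding_heading(root, tag="h3"):
    """Map each heading's text to the tags of the siblings that follow it."""
    groups = {}
    current = None
    for child in root:
        if child.tag == tag:
            current = child.text       # start a new group at every heading
            groups[current] = []
        elif current is not None:      # skip anything before the first heading
            groups[current].append(child.tag)
    return groups


sections = group_by_preceding_heading(ET.fromstring(DOC))
print(sections)
```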
