25

How would you find all nodes between two H3's using XPATH?

klumsy

5 Answers

33

In XPath 1.0 one way to do this is by using the Kayessian method for node-set intersection:

$ns1[count(.|$ns2) = count($ns2)]

The above expression selects exactly the nodes that belong to both the node-set $ns1 and the node-set $ns2.
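
To see why the count test works: the union of a node with $ns2 has the same size as $ns2 exactly when that node is already a member of $ns2. Here is a minimal Python sketch of that identity using plain sets. The function name and the string labels standing in for nodes are hypothetical, chosen only for illustration.

```python
def kayessian_intersection(ns1, ns2):
    """Keep the members of ns1 for which count(. | ns2) == count(ns2)."""
    ns2 = set(ns2)
    # {node} | ns2 only grows when node is NOT already in ns2.
    return [node for node in ns1 if len({node} | ns2) == len(ns2)]


# Hypothetical node labels, listed in document order.
ns1 = ["a32", "b32", "h3_3", "a33"]   # e.g. nodes after some start marker
ns2 = ["h3_1", "a31", "a32", "b32"]   # e.g. nodes before some end marker

print(kayessian_intersection(ns1, ns2))  # ['a32', 'b32']
```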

To apply this to the specific question -- let's say we need to select all nodes between the 2nd and 3rd h3 element in the following XML document:

<html>
  <h3>Title T31</h3>
    <a31/>
    <b31/>
  <h3>Title T32</h3>
    <a32/>
    <b32/>
  <h3>Title T33</h3>
    <a33/>
    <b33/>
  <h3>Title T34</h3>
    <a34/>
    <b34/>
  <h3>Title T35</h3>
</html>

We have to substitute $ns1 with:

/*/h3[2]/following-sibling::node()

and to substitute $ns2 with:

/*/h3[3]/preceding-sibling::node()

Thus, the complete XPath expression is:

/*/h3[2]/following-sibling::node()
             [count(.|/*/h3[3]/preceding-sibling::node())
             =
              count(/*/h3[3]/preceding-sibling::node())
             ]

We can verify that this is the correct XPath expression:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>

 <xsl:template match="/">
  <xsl:copy-of select=
   "/*/h3[2]/following-sibling::node()
             [count(.|/*/h3[3]/preceding-sibling::node())
             =
              count(/*/h3[3]/preceding-sibling::node())
             ]
   "/>
 </xsl:template>
</xsl:stylesheet>

When this transformation is applied to the XML document above, it produces the wanted result:

<a32/>

<b32/>

II. XPath 2.0 solution:

Use the intersect operator:

   /*/h3[2]/following-sibling::node()
intersect
   /*/h3[3]/preceding-sibling::node()
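
As a cross-check outside XSLT, the same intersection can be emulated in Python with the standard library. This is only a sketch: ElementTree does not support the sibling axes, so the helpers `following_siblings` and `preceding_siblings` below are hypothetical stand-ins built from list slices, and ElementTree exposes element children only, so the whitespace text nodes that node() also matches do not appear.

```python
import xml.etree.ElementTree as ET

DOC = """<html>
  <h3>Title T31</h3>
    <a31/>
    <b31/>
  <h3>Title T32</h3>
    <a32/>
    <b32/>
  <h3>Title T33</h3>
    <a33/>
    <b33/>
  <h3>Title T34</h3>
    <a34/>
    <b34/>
  <h3>Title T35</h3>
</html>"""

root = ET.fromstring(DOC)
children = list(root)            # child elements in document order
headings = root.findall("h3")    # the same element objects as in `children`


def following_siblings(node):
    """Stand-in for following-sibling::* (elements only)."""
    return children[children.index(node) + 1:]


def preceding_siblings(node):
    """Stand-in for preceding-sibling::* (elements only)."""
    return children[:children.index(node)]


after_second = following_siblings(headings[1])                   # like $ns1
before_third = {id(n) for n in preceding_siblings(headings[2])}  # like $ns2
between = [n.tag for n in after_second if id(n) in before_third]

print(between)  # ['a32', 'b32']
```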
Dimitre Novatchev
  • So the one use case this doesn't cover is content after the last H3. I'm curious what modification would be needed to pick this up. – klumsy Oct 02 '10 at 00:28
  • @klumsy: Just prepend `/*/h3[2]/following-sibling::node()[not(/*/h3[3])] | ` to the existing expression. – Dimitre Novatchev Oct 02 '10 at 01:39
  • How can I loop with this expression over more "slices"? How can I replace 2 and 3 with variables in a loop over all h3's? – zypro Feb 19 '18 at 13:43
  • @zypro, it's simple: have variables `$startInd` and `$endInd`, and declare them within your "loop" with the necessary values. Then replace "`2`" in the expression with `$startInd` and "`3`" with `$endInd`. The XPath 2.0 expression can even be this: `for $i in 1 to count(/*/h3) -1 return /*/h3[$i]/following-sibling::node() intersect /*/h3[$i+1]/preceding-sibling::node()` – Dimitre Novatchev Feb 19 '18 at 15:58
7

Another XPath 1.0 solution, for when you know both boundary markers are the same element (h3 in this case):

/html/body/h3[2]/following-sibling::node()
                           [not(self::h3)]
                           [count(preceding-sibling::h3)=2]
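
The logic here is "not an h3, and exactly two h3 siblings come before it". A rough Python equivalent, assuming a small made-up `<body>` document and a hypothetical helper name, and again limited to element children because that is all ElementTree exposes:

```python
import xml.etree.ElementTree as ET


def section_after_nth_heading(parent, n, marker="h3"):
    """Return child elements that have exactly n preceding `marker` siblings."""
    seen = 0            # number of marker elements passed so far
    selected = []
    for child in parent:
        if child.tag == marker:
            seen += 1
        elif seen == n:
            selected.append(child)
    return selected


# Made-up sample document, for illustration only.
body = ET.fromstring(
    "<body><h3>A</h3><p1/><h3>B</h3><p2/><p3/><h3>C</h3><p4/></body>"
)

print([e.tag for e in section_after_nth_heading(body, 2)])  # ['p2', 'p3']
print([e.tag for e in section_after_nth_heading(body, 3)])  # ['p4']
```

Because the test counts preceding headings instead of requiring a closing one, it also returns whatever follows the last heading.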
2

A more general XPath 2.0 solution, for when you want nodes at all tree depths between the two h3 elements, which need not be siblings:

/path/to/first/h3/following::node()[. << /path/to/second/h3]
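
To illustrate what "following the first h3 and before the second in document order" picks up when the two headings sit in different containers, here is a Python sketch over a made-up document. The function name is hypothetical, only elements are considered, and descendants of the start element are skipped because the following axis excludes them.

```python
import xml.etree.ElementTree as ET


def between_in_document_order(root, start, end):
    """Elements after `start` (excluding its descendants) and before `end`."""
    order = list(root.iter())                     # all elements, document order
    inside_start = {id(e) for e in start.iter()}  # start plus its descendants
    i, j = order.index(start), order.index(end)
    return [e for e in order[i + 1:j] if id(e) not in inside_start]


# Made-up sample: the two h3 elements are not siblings.
doc = ET.fromstring(
    "<html><div><h3>One</h3><p><em/></p></div>"
    "<div><span/><h3>Two</h3><p/></div></html>"
)
first, second = doc.findall(".//h3")

print([e.tag for e in between_in_document_order(doc, first, second)])
# ['p', 'em', 'div', 'span']
```

Note that the second div is included: it starts before the second h3 in document order, so it satisfies the `<<` test even though it contains that h3.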
Nick Jones
1

Based on Dimitre Novatchev's excellent answer, I came up with the following solution. Rather than hardcoding [2] and [3] for the different H3s, I just give the text content of the first heading:

//h3[text()="Main Page Section Heading"]/following-sibling::node()
 [  count(.|//h3[text()="Main Page Section Heading"]/following-sibling::h3[1]/preceding-sibling::node()) =  
    count(//h3[text()="Main Page Section Heading"]/following-sibling::h3[1]/preceding-sibling::node())  ]

Where I'd like to go further, though, is handling the case where I'm looking at the last H3 and want everything after it; the expression above can't select what follows the last H3.

Community
klumsy
0

There's another great generic solution using keys, assuming that each of your <h3> tags has a unique property (e.g. its text or an id attribute):

<xsl:key name="siblings_of_h3" match="*[not(self::h3)]" use="preceding-sibling::h3[1]/text()"/>

<xsl:template match="h3">
  <!-- now select all tags belonging to the current h3 -->
  <xsl:apply-templates select="key('siblings_of_h3', text())"/>
</xsl:template>

It groups all elements by their nearest preceding <h3>.
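
For comparison, the same grouping idea can be sketched in Python with a dictionary playing the role of the key. The function name and the shortened sample document are assumptions for illustration, and heading text is assumed to be unique, as the key-based approach also requires.

```python
import xml.etree.ElementTree as ET


def group_by_preceding_heading(parent, marker="h3"):
    """Map each heading's text to the tags of the siblings that follow it."""
    groups = {}
    current = None                  # nothing is collected before the first heading
    for child in parent:
        if child.tag == marker:
            current = child.text
            groups[current] = []
        elif current is not None:
            groups[current].append(child.tag)
    return groups


# Shortened version of the sample document, for illustration only.
doc = ET.fromstring(
    "<html><h3>Title T31</h3><a31/><b31/>"
    "<h3>Title T32</h3><a32/>"
    "<h3>Title T33</h3><a33/><b33/></html>"
)
groups = group_by_preceding_heading(doc)

print(groups["Title T32"])  # ['a32']
print(groups["Title T33"])  # ['a33', 'b33']
```

The group of the last heading simply runs to the end of the parent, so content after the final <h3> is picked up without special handling.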

klaus triendl