41

I have the following XML:

<doc>
    <divider />
    <p>text</p>
    <p>text</p>
    <p>text</p>
    <p>text</p>
    <p>text</p>
    <divider />
    <p>text</p>
    <p>text</p>
    <divider />
    <p>text</p>
    <divider />
</doc>

I want to select all p nodes after the first divider element, up to the next divider element. I tried the following XPath:

//divider[1]/following-sibling::p[following::divider]

but the problem is that it selects all p elements before the last divider element. I'm not sure how to do this using XPath 1.0.

Mirko

4 Answers

39

Same concept as bytebuster's answer, but a different XPath:

/*/p[count(preceding-sibling::divider)=1]
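To make the predicate concrete, here is a minimal Python sketch of the same counting logic using only the standard library. It does not evaluate the XPath itself (ElementTree supports neither count() nor the preceding-sibling axis); it walks the children of the root and keeps each p that has exactly n divider siblings before it. The helper name p_with_n_preceding_dividers and the distinct text values (a1, b1, ...) are assumptions added for illustration; the document structure is the one from the question.

```python
import xml.etree.ElementTree as ET

# Same structure as the question's XML; the text values are made distinct
# (an assumption for illustration) so the selected group is visible.
XML = """<doc>
    <divider />
    <p>a1</p>
    <p>a2</p>
    <p>a3</p>
    <p>a4</p>
    <p>a5</p>
    <divider />
    <p>b1</p>
    <p>b2</p>
    <divider />
    <p>c1</p>
    <divider />
</doc>"""


def p_with_n_preceding_dividers(root, n):
    """Hypothetical helper: <p> children of root that have exactly
    n preceding <divider> siblings, i.e. count(preceding-sibling::divider) = n."""
    selected = []
    dividers_seen = 0
    for child in root:  # children are visited in document order
        if child.tag == "divider":
            dividers_seen += 1
        elif child.tag == "p" and dividers_seen == n:
            selected.append(child)
    return selected


root = ET.fromstring(XML)
first_group = [p.text for p in p_with_n_preceding_dividers(root, 1)]
print(first_group)
```

Changing n from 1 to 2 corresponds to changing =1 to =2 in the XPath, which selects the p elements between the second and third dividers.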
Daniel Haley
21

Here is a general XPath expression:

/*/divider[$k]
    /following-sibling::p
       [count(.|/*/divider[$k+1]/preceding-sibling::p)
       =
        count(/*/divider[$k+1]/preceding-sibling::p)
       ]

If you substitute $k with 1, then exactly the wanted p nodes are selected.

If you substitute $k with 2, then all p elements between the 2nd and 3rd divider are selected, and so on.

Explanation:

This is a simple application of the Kayessian XPath 1.0 formula for node-set intersection:

$ns1[count(.|$ns2) = count($ns2)]

selects all the nodes that belong to both of the node-sets $ns1 and $ns2.

In this specific case we substitute $ns1 with:

/*/divider[$k]/following-sibling::p

and we substitute $ns2 with:

/*/divider[$k+1]/preceding-sibling::p
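
The reason the formula works is that the union .|$ns2 has the same size as $ns2 only when the context node is already a member of $ns2. A rough Python sketch of that idea follows, using the standard library only. It models node-sets as lists of elements and mimics the two sibling axes by slicing the child list rather than evaluating XPath. All function names and the distinct text values are assumptions added for illustration.

```python
import xml.etree.ElementTree as ET

# Same structure as the question's XML, with distinct text values
# (an assumption for illustration).
XML = """<doc>
    <divider /><p>a1</p><p>a2</p><p>a3</p><p>a4</p><p>a5</p>
    <divider /><p>b1</p><p>b2</p>
    <divider /><p>c1</p>
    <divider />
</doc>"""


def kayessian_intersection(ns1, ns2):
    """Mimics $ns1[count(.|$ns2) = count($ns2)]: keep a node from ns1
    only if adding it to ns2 does not make ns2 any larger."""
    ns2_ids = {id(n) for n in ns2}  # node identity, as in an XPath node-set
    return [n for n in ns1 if len({id(n)} | ns2_ids) == len(ns2_ids)]


def p_between_dividers(root, k):
    """Hypothetical helper: <p> elements between the k-th and (k+1)-th
    <divider> children of root (k is 1-based, as in XPath)."""
    children = list(root)
    idx = [i for i, c in enumerate(children) if c.tag == "divider"]
    # $ns1: /*/divider[$k]/following-sibling::p (empty if there is no k-th divider)
    ns1 = ([c for c in children[idx[k - 1] + 1:] if c.tag == "p"]
           if 1 <= k <= len(idx) else [])
    # $ns2: /*/divider[$k+1]/preceding-sibling::p (empty if there is no (k+1)-th divider)
    ns2 = ([c for c in children[:idx[k]] if c.tag == "p"]
           if 0 <= k < len(idx) else [])
    return kayessian_intersection(ns1, ns2)


root = ET.fromstring(XML)
for k in (1, 2, 3):
    print(k, [p.text for p in p_between_dividers(root, k)])
```

In Python a plain set intersection would do the same job; the count-based test is spelled out here only because XPath 1.0 has a union operator but no intersection operator.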
Dimitre Novatchev
8

I think there's a much simpler and probably faster solution: you want all preceding siblings of the second divider that have at least one preceding sibling divider:

/doc/divider[2]/preceding-sibling::p[preceding-sibling::divider]

It gets a bit more complex, of course, if you want to find the paras between the second and third dividers: then you want something more like Daniel Haley's solution.

Daniel Haley
Michael Kay
  • Allows start and end divider tags to vary so, for instance, one may select items between an `H1` and the first `TABLE`. Simple and flexible. – vhs Apr 24 '20 at 20:56
6

What about selecting all p elements that have exactly one divider element as a preceding sibling?

//doc/p[preceding-sibling::divider[1] and not (preceding-sibling::divider[2])]
Be Brave Be Like Ukraine