How can I address the nodes in between two elements?

Question

I want to address the nodes in between two elements: the second one is identified by an @xml:id, and the first one refers to the second one via this id. More often than not, other sibling elements (which are irrelevant to this issue and should be processed as usual) sit between the two elements in question.

<root>
... text i'm not interested in ...
<A ref="#id_1"/> interesting <C>text</C> no 1 <B xml:id="id_1"/>
... text i'm not interested in ...
<A ref="#id_2"/> interesting text no 2 <B xml:id="id_2"/>
... text i'm not interested in ...
</root>

What I'm looking for is an XPath expression that selects, for every element "A" with the attribute "ref", the nodes following this element up to the specific element "B" with the id provided in A's "ref".

So in the example given above, for the first "A", it should select

"interesting <C>text</C> no 1"

and for the second "A"

"interesting text no 2"

(and so on; the number of "A"- and "B"-elements is pretty high).

So far, my rough guess is that some kind of node intersection could be part of the solution. (I'm using XPath 2.0.)

Are you only interested in text() nodes, so the `` is not included in the selected nodes? — choroba, Apr 16 '19 at 08:30
@choroba - Thanks for pointing this out - I've edited the question for more precision. — b_o, Apr 16 '19 at 08:48
Then what you select is a node-list, so you can't use just one XPath expression for all the node-lists: they would be flattened to one large node-list where you can't tell which node belongs to which A node. — choroba, Apr 16 '19 at 09:13
Can there be an A or B without any reference? Can the "interesting" chunks overlap, i.e. `123456789>`? — choroba, Apr 16 '19 at 09:49
What version of XPath do you use? I fear there's no solution in XPath 1. — choroba, Apr 16 '19 at 11:53

Otrozone · Answer 1 · 2019-04-16T15:52:14.593


As choroba wrote in a comment, you can get the values using XPath axes:

//A/following-sibling::text()[1]

To restrict this to A elements that have a ref attribute, you can use:

//A[@ref]/following-sibling::text()[1]

Update: Maybe the Kayessian method for node-set intersection can help you (see this SO question):

/*/A[1]/following-sibling::node()[count(.|/*/B[1]/preceding-sibling::node()) = count(/*/B[1]/preceding-sibling::node())]

To get the second occurrence, just replace every [1] with [2].
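The intersection trick can be sketched outside XPath to see why it works. The Python below is an illustration only, not part of the answer: it assumes the standard library's xml.dom.minidom, a condensed copy of the sample document, and hypothetical helper names (following_siblings, preceding_siblings, child_elements, between). A node n belongs to a set S exactly when adding n to S does not change the size of S, which is what count(.|S) = count(S) tests.

```python
from xml.dom import minidom

# Condensed copy of the question's sample (filler text shortened; an assumption).
XML = (
    '<root>before'
    '<A ref="#id_1"/> interesting <C>text</C> no 1 <B xml:id="id_1"/>'
    'middle'
    '<A ref="#id_2"/> interesting text no 2 <B xml:id="id_2"/>'
    'after</root>'
)
ROOT = minidom.parseString(XML).documentElement


def following_siblings(node):
    """All later siblings of node, like following-sibling::node()."""
    out = []
    node = node.nextSibling
    while node is not None:
        out.append(node)
        node = node.nextSibling
    return out


def preceding_siblings(node):
    """All earlier siblings of node, like preceding-sibling::node()."""
    out = []
    node = node.previousSibling
    while node is not None:
        out.append(node)
        node = node.previousSibling
    return out


def child_elements(root, name):
    """Child elements of root with the given tag name, in document order."""
    return [n for n in root.childNodes
            if n.nodeType == n.ELEMENT_NODE and n.tagName == name]


def between(root, index):
    """Siblings after the index-th A that are also before the index-th B (1-based)."""
    a = child_elements(root, "A")[index - 1]
    b = child_elements(root, "B")[index - 1]
    before_b = {id(n) for n in preceding_siblings(b)}
    # n is in the intersection when adding it to before_b does not grow the set.
    return [n for n in following_siblings(a)
            if len(before_b | {id(n)}) == len(before_b)]


first = "".join(n.toxml() for n in between(ROOT, 1))
second = "".join(n.toxml() for n in between(ROOT, 2))
```

Like the expression above, this sketch pairs the n-th A with the n-th B by position; it never looks at ref or xml:id.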

edited Apr 16 '19 at 15:52

answered Apr 16 '19 at 09:00


Thanks! To my understanding, this would select everything from the A-element to the next sibling. What if I want it to select everything from the element A to the specific element B referenced in A's @ref? – b_o Apr 16 '19 at 09:07

Alejandro · Accepted Answer · 2019-04-17T12:57:06.270

This XPath 2.0 expression

/root/(
   for $a in A, 
       $b in B[concat('#', @xml:id) = $a/@ref][1] 
   return .//text()[$b >> .][. >> $a]
)

selects these text nodes (quotes added for clarity):

' interesting '
'text'
' no 1 '
' interesting text no 2 '

Test in https://xsltfiddle.liberty-development.net/bFN1y9t

Note the use of the for expression as an "inner join": each A is paired only with the B whose xml:id matches its ref.
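The join can also be read as a nested loop. The following Python sketch is only an illustration under stated assumptions, not the answer's method: it uses the standard library's xml.dom.minidom, a condensed copy of the sample document, a hypothetical function name (chunks_by_ref), and it assumes that each A and its B are children of the same parent with B after A. Unlike a single XPath expression, it keeps one result per A instead of one flattened sequence.

```python
from xml.dom import minidom

# Condensed copy of the question's sample (filler text shortened; an assumption).
XML = (
    '<root>before'
    '<A ref="#id_1"/> interesting <C>text</C> no 1 <B xml:id="id_1"/>'
    'middle'
    '<A ref="#id_2"/> interesting text no 2 <B xml:id="id_2"/>'
    'after</root>'
)


def chunks_by_ref(xml_text):
    """Map each A's ref to the serialized siblings between that A and its B."""
    root = minidom.parseString(xml_text).documentElement
    children = list(root.childNodes)
    result = {}
    for i, a in enumerate(children):
        is_a = a.nodeType == a.ELEMENT_NODE and a.tagName == "A"
        if not is_a or not a.hasAttribute("ref"):
            continue
        ref = a.getAttribute("ref")
        # Inner join: look for the later sibling B whose '#' + xml:id equals ref.
        for j in range(i + 1, len(children)):
            b = children[j]
            if (b.nodeType == b.ELEMENT_NODE and b.tagName == "B"
                    and "#" + b.getAttribute("xml:id") == ref):
                result[ref] = "".join(n.toxml() for n in children[i + 1:j])
                break
        # An A without a matching B contributes nothing, as in an inner join.
    return result


chunks = chunks_by_ref(XML)
unmatched = chunks_by_ref('<root><A ref="#x"/>text<B xml:id="y"/></root>')
```

Here chunks holds one entry per matched pair, keyed by the ref value, and unmatched is empty because no B carries the referenced id.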

XPath 1.0 has no way to bind a variable inside an expression (there is no for expression), so there is no way to express an "inner join" either. But if you are sure that the starting and ending marks never overlap, you could use:

/root//text()[
  (preceding::A|preceding::B)[last()][self::A]
][(following::A|following::B)[1][self::B]
]

Or

/root//text()[
   preceding::*[self::A|self::B][1][self::A]
][following::*[self::A|self::B][1][self::B]
]

Test in http://www.xpathtester.com/xpath/a3051d2ad3af3423502b221bef6a580e

Edited Question

What I'm looking for is an XPath expression that selects, for every element "A" with the attribute "ref", the nodes following this element up to the specific element "B" with the id provided in A's "ref".

If you now want the child nodes instead of the descendant text nodes, just replace the path in the expression:

XPath 2.0 expression

/root/(
   for $a in A, 
       $b in B[concat('#', @xml:id) = $a/@ref][1] 
   return node()[$b >> .][. >> $a]
)

XPath 1.0 expression

/root/node()[
  (preceding::A|preceding::B)[last()][self::A]
][(following::A|following::B)[1][self::B]
]
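The logic of the XPath 1.0 variant (nearest marker before the node is an A, nearest marker after it is a B) can be emulated in a host language to check it against the sample. The Python below is a rough sketch under assumptions, not an XPath evaluation: it uses the standard library's xml.dom.minidom, a condensed copy of the sample document, hypothetical names (is_marker, select_between_markers), and it only considers markers that are direct children of the root.

```python
from xml.dom import minidom

# Condensed copy of the question's sample (filler text shortened; an assumption).
XML = (
    '<root>before'
    '<A ref="#id_1"/> interesting <C>text</C> no 1 <B xml:id="id_1"/>'
    'middle'
    '<A ref="#id_2"/> interesting text no 2 <B xml:id="id_2"/>'
    'after</root>'
)


def is_marker(node):
    """True for the A and B elements that delimit the interesting chunks."""
    return node.nodeType == node.ELEMENT_NODE and node.tagName in ("A", "B")


def select_between_markers(xml_text):
    """Children of the root whose nearest marker before is A and after is B."""
    children = list(minidom.parseString(xml_text).documentElement.childNodes)
    marker_positions = [i for i, n in enumerate(children) if is_marker(n)]
    selected = []
    for i, node in enumerate(children):
        before = [p for p in marker_positions if p < i]
        after = [p for p in marker_positions if p > i]
        if (before and after
                and children[before[-1]].tagName == "A"
                and children[after[0]].tagName == "B"):
            selected.append(node)
    return selected


selected = [n.toxml() for n in select_between_markers(XML)]
```

The result is one flat list covering both pairs, so nothing in it says which node belongs to which A; that is the limitation choroba pointed out in the comments.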

Thank you very much for your suggestion; it obviously works. The fact that it didn't solve my problem is due to a lack of specificity in my question. I've taken the opportunity to clarify it. — b_o, Apr 17 '19 at 09:27
