
I want to address the nodes between two elements. The second one is identified by an @xml:id, and the first one refers to the second via this id. More often than not, other sibling elements (which are irrelevant to this issue and should be processed as usual) lie between the two elements in question.

<root>
... text i'm not interested in ...
<A ref="#id_1"/> interesting <C>text</C> no 1 <B xml:id="id_1"/>
... text i'm not interested in ...
<A ref="#id_2"/> interesting text no 2 <B xml:id="id_2"/>
... text i'm not interested in ...
</root>

What I'm looking for is an XPath expression that selects, for every element "A" with the attribute "ref", the nodes following this element up to the specific element "B" with the id provided in A's "ref".

So in the example given above, for the first "A", it should select

"interesting <C>text</C> no 1"

and for the second "A"

"interesting text no 2"

(and so on; the number of "A"- and "B"-elements is pretty high).

So far, my rough guess is that a node-set intersection could be part of the solution. (I'm using XPath 2.0.)

b_o

2 Answers


As user choroba wrote in a comment, you can get the values using XPath axes:

//A/following-sibling::text()[1]

To get only elements with ref attribute, you can use:

//A[@ref]/following-sibling::text()[1]

Update: Maybe the Kayessian method for node-set intersection can help you (see this SO question):

/*/A[1]/following-sibling::node()[count(.|/*/B[1]/preceding-sibling::node()) = count(/*/B[1]/preceding-sibling::node())]

To get the second occurrence, just replace every [1] with [2].
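
The counting trick in that expression can be hard to read, so here is a minimal Python sketch of the same idea, not the XPath engine's actual implementation. It assumes the standard-library xml.dom.minidom parser; the helper names (following_siblings, preceding_siblings, kayessian_intersection, between_first_pair) and the tiny sample document in the usage note are hypothetical. A node is kept when adding it to the second node-set does not make that set any larger, which is exactly the count(.|$ns2) = count($ns2) test.

```python
from xml.dom import minidom


def following_siblings(node):
    """All siblings after `node`, in document order."""
    out = []
    node = node.nextSibling
    while node is not None:
        out.append(node)
        node = node.nextSibling
    return out


def preceding_siblings(node):
    """All siblings before `node`, in document order."""
    out = []
    node = node.previousSibling
    while node is not None:
        out.append(node)
        node = node.previousSibling
    return list(reversed(out))


def kayessian_intersection(ns1, ns2):
    """Keep the nodes of ns1 whose union with ns2 is no larger than ns2."""
    ids2 = {id(n) for n in ns2}
    return [n for n in ns1 if len(ids2 | {id(n)}) == len(ids2)]


def between_first_pair(xml_text):
    """Serialize the siblings between the first A and the first B."""
    root = minidom.parseString(xml_text).documentElement
    a = root.getElementsByTagName("A")[0]
    b = root.getElementsByTagName("B")[0]
    nodes = kayessian_intersection(following_siblings(a), preceding_siblings(b))
    return "".join(n.toxml() for n in nodes)
```

For a hypothetical input such as `<root>x<A/>y<C/>z<B/>w</root>`, between_first_pair returns the serialized nodes y, the empty C element and z. Like the XPath above, it pairs markers by position, not by the id in @ref.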

Otrozone
  • Thanks! To my understanding, this would select everything from the A-element to the next sibling. What if I want it to select everything from the element A to the specific element B referenced in A's @ref? – b_o Apr 16 '19 at 09:07

This XPath 2.0 expression

/root/(
   for $a in A, 
       $b in B[concat('#', @xml:id) = $a/@ref][1] 
   return .//text()[$b >> .][. >> $a]
)

selects these text nodes (quotes added for clarity):

' interesting '
'text'
' no 1 '
' interesting text no 2 '

Test in https://xsltfiddle.liberty-development.net/bFN1y9t

Note the use of the for expression as an "inner join".
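
To make the "inner join" idea concrete, here is a rough Python sketch of the same pairing logic, assuming the standard-library xml.dom.minidom parser and the sample document from the question. The names SAMPLE and nodes_between_markers are made up for illustration. Unlike the XPath above, it collects the sibling nodes between each A and its B, not the descendant text nodes.

```python
from xml.dom import minidom

SAMPLE = """<root>
... text i'm not interested in ...
<A ref="#id_1"/> interesting <C>text</C> no 1 <B xml:id="id_1"/>
... text i'm not interested in ...
<A ref="#id_2"/> interesting text no 2 <B xml:id="id_2"/>
... text i'm not interested in ...
</root>"""


def nodes_between_markers(xml_text):
    """For each A[@ref], serialize the siblings up to the B it references."""
    root = minidom.parseString(xml_text).documentElement
    children = list(root.childNodes)

    # Index every B child by its xml:id so each A can look up its partner.
    b_position = {}
    for index, node in enumerate(children):
        if node.nodeType == node.ELEMENT_NODE and node.tagName == "B":
            b_position[node.getAttribute("xml:id")] = index

    results = []
    for index, node in enumerate(children):
        if node.nodeType != node.ELEMENT_NODE or node.tagName != "A":
            continue
        if not node.hasAttribute("ref"):
            continue
        ref = node.getAttribute("ref")
        target = ref[1:] if ref.startswith("#") else ref
        end = b_position.get(target)
        # Skip A elements whose B is missing or does not come after them.
        if end is None or end < index:
            continue
        between = children[index + 1:end]
        results.append("".join(n.toxml() for n in between))
    return results
```

Calling nodes_between_markers(SAMPLE) yields one string per A element, with the surrounding whitespace preserved. Each A is matched to its B through the id, so irrelevant siblings outside a pair are never picked up.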

In XPath 1.0 there is no way to declare a closure, so there is no way to make an "inner join" either. But if you are sure there is no overlap between the starting and ending marks, you could use:

/root//text()[
  (preceding::A|preceding::B)[last()][self::A]
][(following::A|following::B)[1][self::B]
]

Or

/root//text()[
   preceding::*[self::A|self::B][1][self::A]
][following::*[self::A|self::B][1][self::B]
]

Test in http://www.xpathtester.com/xpath/a3051d2ad3af3423502b221bef6a580e
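
The two predicates can be read as "the nearest marker before this text node is an A, and the nearest marker after it is a B". Below is a small Python sketch of that reading, assuming the standard-library xml.dom.minidom parser and the sample document from the question. SAMPLE, walk, is_marker and text_between_markers are hypothetical names. It scans all nodes in document order, which matches the preceding and following axes here only because the A and B markers are empty elements and so can never be ancestors of a text node.

```python
from xml.dom import minidom

SAMPLE = """<root>
... text i'm not interested in ...
<A ref="#id_1"/> interesting <C>text</C> no 1 <B xml:id="id_1"/>
... text i'm not interested in ...
<A ref="#id_2"/> interesting text no 2 <B xml:id="id_2"/>
... text i'm not interested in ...
</root>"""


def walk(node):
    """Yield all descendants of `node` in document order."""
    for child in node.childNodes:
        yield child
        yield from walk(child)


def is_marker(node):
    """True for the A and B marker elements."""
    return node.nodeType == node.ELEMENT_NODE and node.tagName in ("A", "B")


def text_between_markers(xml_text):
    """Text nodes whose nearest marker before is A and nearest marker after is B."""
    root = minidom.parseString(xml_text).documentElement
    nodes = list(walk(root))
    selected = []
    for i, node in enumerate(nodes):
        if node.nodeType != node.TEXT_NODE:
            continue
        before = next((n for n in reversed(nodes[:i]) if is_marker(n)), None)
        after = next((n for n in nodes[i + 1:] if is_marker(n)), None)
        if before is None or after is None:
            continue
        if before.tagName == "A" and after.tagName == "B":
            selected.append(node.data)
    return selected
```

This never looks at @ref or @xml:id, which is why it only works when marker pairs do not overlap or nest.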

Edited Question

What I'm looking for is an XPath expression that selects, for every element "A" with the attribute "ref", the nodes following this element up to the specific element "B" with the id provided in A's "ref".

If you now want the nodes rather than the descendant text nodes, just replace the path in the expression:

XPath 2.0 expression

/root/(
   for $a in A, 
       $b in B[concat('#', @xml:id) = $a/@ref][1] 
   return node()[$b >> .][. >> $a]
)

XPath 1.0 expression

/root/node()[
  (preceding::A|preceding::B)[last()][self::A]
][(following::A|following::B)[1][self::B]
]
Alejandro
  • Thank you very much for your suggestion; it obviously works. The fact that it didn't solve my problem is due to a lack of specificity in my question. I've taken the opportunity to clarify it. – b_o Apr 17 '19 at 09:27