
I came across http://xpather.com, which lets you write XPath expressions. It comes with a sample XML document which is similar to this one:

<app>
   <abstract>a</abstract>
   <description>
      <subject>s1</subject>
      <subject>s2</subject>
   </description>
   <extra-notes>
      <note>n1</note>
      <note>n2</note>
      <note>n3</note>
      <note>n4</note>
   </extra-notes>
</app>

and has a sample XPath expression: .//*[self::abstract or self::subject or self::note][position() <= 2]

The result sequence is

   <abstract>a</abstract>
   <subject>s1</subject>
   <subject>s2</subject>
   <note>n1</note>
   <note>n2</note>

I am trying to understand why the result looks like this. The first predicate selects all abstract, subject and note elements, and the second predicate limits this to position() <= 2. My intuitive expectation was that the XPath expression selects only two nodes (abstract and the first subject, s1), but it actually selects up to two matching elements under each parent element. What exactly is going on?

Just for testing: here is my simple stylesheet:

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xsl:output indent="yes"/>
  <xsl:template match="/">
    <e>
      <xsl:copy-of select=".//*[self::abstract or self::subject or self::note][position() &lt;= 2]"/>
    </e>
  </xsl:template>
</xsl:stylesheet>
Dimitre Novatchev
topskip
  • I have provided you with a correct solution and a pointer to complete explanation. Do you still have problems understanding this? – Dimitre Novatchev Aug 14 '21 at 16:43
  • Dimitre, thank you very much. It took me some time to ponder the context of the result and the precedence of `[]` vs. the PathExpr, and I think that I now understand what is going on. – topskip Aug 15 '21 at 15:51

2 Answers


The expression .//*[self::abstract or self::subject or self::note] is an abbreviation for ./descendant-or-self::node()/*[self::abstract or self::subject or self::note]. Your positional predicate applies to the last step, /*[self::abstract or self::subject or self::note], which is evaluated separately for each node selected by descendant-or-self::node(). It therefore selects every element that is the first or second among the children of its parent selected by *[self::abstract or self::subject or self::note].
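The per-parent evaluation described above can be sketched in plain Python. This is only an emulation of the semantics, not an XPath engine: the helper name per_parent_first_two and the NAMES set are made up for illustration, and the standard-library ElementTree module is used merely to walk the tree.

```python
import xml.etree.ElementTree as ET

XML = """<app>
   <abstract>a</abstract>
   <description>
      <subject>s1</subject>
      <subject>s2</subject>
   </description>
   <extra-notes>
      <note>n1</note>
      <note>n2</note>
      <note>n3</note>
      <note>n4</note>
   </extra-notes>
</app>"""

# Element names accepted by the first predicate (illustrative constant).
NAMES = {"abstract", "subject", "note"}


def per_parent_first_two(root):
    """Emulate .//*[self::abstract or self::subject or self::note][position() <= 2]."""
    selected = []
    # root.iter() yields the element and all its descendants in document order.
    # It stands in for descendant-or-self::node(); the document node itself is
    # skipped, which is harmless here because its only child, app, never matches.
    for parent in root.iter():
        # Child step plus first predicate: matching children of this one parent.
        matching = [child for child in parent if child.tag in NAMES]
        # Second predicate: position() counts within this parent's list only.
        selected.extend(matching[:2])
    # Assumption: for this sample the concatenation is already in document
    # order; a real engine sorts the combined result into document order.
    return selected


root = ET.fromstring(XML)
result = [el.text for el in per_parent_first_two(root)]
print(result)  # ['a', 's1', 's2', 'n1', 'n2']
```

The position counter restarts for every parent, which is why n3 and n4 are dropped while s1 and s2 both survive.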

Martin Honnen
  • I am still confused. The first predicate evaluates to a sequence of nodes, correct? So (a,s1,s2,n1,n2,n3,n4). Now the second predicate is applied. Why is the positional argument carried over to the second predicate? I obviously don't understand what is going on. Is my assumption wrong that the second argument is applied after getting the result of the first predicate? – topskip Aug 13 '21 at 09:17
  • @topskip, I would rather say that each item selected by `/*[self::abstract or self::subject or self::note]` is further filtered by `[position() le 2]`, that is, by whether it is the first or second such child of its parent. The result you want, with the last predicate applied to the whole sequence, would be given by `(.//*[self::abstract or self::subject or self::note])[position() <= 2]`. – Martin Honnen Aug 13 '21 at 09:34
  • I thought that once the first predicate had been applied, the context item lost its connection to the tree, so no further filtering on position() would be possible. But realizing that the selected nodes are still part of the tree and still have their context (so the second predicate is applied within the tree again) makes much sense and is obviously the answer to my question. Thank you and Dimitre very much for your help! Very much appreciated! – topskip Aug 15 '21 at 15:49

Use:

(//*[self::abstract or self::subject or self::note])[not(position() > 2)]

For an explanation, see this answer: "XPath query to get nth instance of an element".

Remember: The [] operator has higher precedence (priority) than the // abbreviation.
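As a rough illustration of what the parentheses change, the following Python sketch emulates the parenthesized expression: all matching elements are collected first, and only then is the position filter applied to the whole sequence. It is an emulation under stated assumptions, not an XPath engine; the helper name first_two_overall and the NAMES set are made up for this example.

```python
import xml.etree.ElementTree as ET

XML = """<app>
   <abstract>a</abstract>
   <description>
      <subject>s1</subject>
      <subject>s2</subject>
   </description>
   <extra-notes>
      <note>n1</note>
      <note>n2</note>
      <note>n3</note>
      <note>n4</note>
   </extra-notes>
</app>"""

# Element names accepted by the first predicate (illustrative constant).
NAMES = {"abstract", "subject", "note"}


def first_two_overall(root):
    """Emulate (//*[self::abstract or self::subject or self::note])[not(position() > 2)]."""
    # Inside the parentheses: every matching element, in document order.
    matching = [el for el in root.iter() if el.tag in NAMES]
    # The predicate now applies to the parenthesized sequence as a whole,
    # so position() counts across all matches rather than per parent.
    return matching[:2]


root = ET.fromstring(XML)
result = [el.text for el in first_two_overall(root)]
print(result)  # ['a', 's1']
```

Here the sequence is built once and then truncated, which matches the asker's original expectation of getting only abstract and the first subject.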

Dimitre Novatchev