10

I don't clearly understand the difference between using //element and /descendant::element when selecting multiple children of a base element in XPath.

Given this HTML snippet:

<html>
<body>
<div class="popupContent">
  <table>
    <tr class="aclass"><td> Hello </td> <td> <input type="text" value="FIRST" /> </td></tr>
    <tr class="aclass"><td> Goodbye </td> <td> <input type="text" value="SECOND" /> </td></tr>
  </table>
</div>
</body>
</html>

I need to select each input based on its position in the table:

  • //div[@class='popupContent']//input[1] selects the first input
  • //div[@class='popupContent']//input[2] gives an error
  • //div[@class='popupContent']/descendant::input[1] again selects the first input
  • //div[@class='popupContent']/descendant::input[2] selects the second input

Using /descendant::input does what I need: it grabs all inputs and lets me select by position.
How does // differ? Why does it return only the first element and not the ones after?

I'm aware of this other question, but the answer basically says they're aliases and points to the documentation, which I cannot understand and which lacks a concrete example. The difference from that question is that I need to select multiple child elements, and // doesn't allow it.

Alessandro Da Rugna
  • first expression `//div[@class='popupContent']//input[1]` returns both inputs. – splash58 Nov 25 '15 at 14:20
  • Possible duplicate of [What's the difference between //node and /descendant::node in xpath?](https://stackoverflow.com/questions/1537771/whats-the-difference-between-node-and-descendantnode-in-xpath) – Robert Columbia Jul 31 '19 at 15:48
  • @RobertColumbia as I mentioned in the question itself, I'm asking for a different clarification. – Alessandro Da Rugna Jul 31 '19 at 18:40

2 Answers

15

According to XPath 1.0, §2.5 Abbreviated Syntax:

// is short for /descendant-or-self::node()/

So div[@class='popupContent']//input[1] (same as div[@class='popupContent']/descendant-or-self::node()/child::input[1]) will:

  1. go to the divs with that "popupContent" class and all of their descendants (children, children of children and so on),
  2. then look for <input> children of each of those nodes,
  3. and finally keep only each <input> that is the first <input> child of its parent ([1] predicate)

div[@class='popupContent']//input[2] is very similar, except that the last step keeps only each <input> that is the 2nd <input> child of its parent. None of the <input>s here is the 2nd <input> child of its parent, so nothing is selected.
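The per-parent counting described above can be checked with a minimal sketch. It assumes Python's standard xml.etree.ElementTree module, whose limited XPath subset also evaluates a position predicate relative to each parent. The variable names are illustrative, and the expressions are evaluated relative to the div, hence the leading dot.

```python
import xml.etree.ElementTree as ET

# The snippet from the question (it happens to be well-formed XML).
SNIPPET = """
<html>
<body>
<div class="popupContent">
  <table>
    <tr class="aclass"><td> Hello </td> <td> <input type="text" value="FIRST" /> </td></tr>
    <tr class="aclass"><td> Goodbye </td> <td> <input type="text" value="SECOND" /> </td></tr>
  </table>
</div>
</body>
</html>
"""

root = ET.fromstring(SNIPPET)
div = root.find(".//div[@class='popupContent']")

# .//input[1]: every <input> that is the first <input> child of its own parent.
first_per_parent = [e.get("value") for e in div.findall(".//input[1]")]

# .//input[2]: every <input> that is the second <input> child of its own parent.
second_per_parent = [e.get("value") for e in div.findall(".//input[2]")]

print(first_per_parent)   # ['FIRST', 'SECOND']
print(second_per_parent)  # []
```

Each <td> holds exactly one <input>, so both inputs count as "first" and neither counts as "second".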

div[@class='popupContent']/descendant::input[2] on the other hand will:

  1. go to all descendants of the divs with that class,
  2. select only the <input> elements among them and build a node-set out of them,
  3. finally select the 2nd element in that node-set, in document order
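The three steps above can be emulated in plain Python. ElementTree's XPath subset does not offer the descendant:: axis, so this sketch walks the tree by hand; nth_descendant is a hypothetical helper written for illustration, not a library function.

```python
import xml.etree.ElementTree as ET

# A compact copy of the question's table, reduced to the relevant elements.
div = ET.fromstring(
    '<div class="popupContent"><table>'
    '<tr><td>Hello</td><td><input type="text" value="FIRST"/></td></tr>'
    '<tr><td>Goodbye</td><td><input type="text" value="SECOND"/></td></tr>'
    '</table></div>'
)

def nth_descendant(context, tag, n):
    """Emulate context/descendant::tag[n] (hypothetical helper, 1-based n)."""
    # Steps 1 and 2: collect every matching descendant in document order.
    # Element.iter() also yields the context element itself, so skip it.
    matches = [e for e in context.iter(tag) if e is not context]
    # Step 3: pick the n-th member of that single node-set.
    return matches[n - 1] if 0 < n <= len(matches) else None

print(nth_descendant(div, "input", 1).get("value"))  # FIRST
print(nth_descendant(div, "input", 2).get("value"))  # SECOND
print(nth_descendant(div, "input", 3))               # None
```

The position is counted once over the whole set of descendants, which is why asking for the 2nd input succeeds here.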

You can read about predicates and axes in §2.4 Predicates. Relevant pieces:

(...) the ancestor, ancestor-or-self, preceding, and preceding-sibling axes are reverse axes; all other axes are forward axes.

[Thus descendant is a forward axis.]

The proximity position of a member of a node-set with respect to an axis is defined to be the position of the node in the node-set ordered in document order if the axis is a forward axis (...). The first position is 1.

A predicate filters a node-set with respect to an axis to produce a new node-set. For each node in the node-set to be filtered, the PredicateExpr is evaluated with that node as the context node, with the number of nodes in the node-set as the context size, and with the proximity position of the node in the node-set with respect to the axis as the context position;

paul trmbrth
4

The only difference between //X and /descendant::X arises when X contains a positional predicate, for example //x[1] vs /descendant::x[1]. In this situation //x[1] selects every x element that is the first x child of its parent element, whereas /descendant::x[1] selects the first descendant x overall. You can work this out by remembering that //x[1] is short for /descendant-or-self::node()/child::x[1].
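As a rough illustration of that expansion, the sketch below spells out each step by hand in Python. The document and its element names (root, a, b, x) are made up for the example, and the loop is an emulation of the expanded expression, not an XPath engine.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: two parents, each with two x children.
root = ET.fromstring(
    "<root>"
    "<a><x id='a1'/><x id='a2'/></a>"
    "<b><x id='b1'/><x id='b2'/></b>"
    "</root>"
)

# //x[1] spelled out as /descendant-or-self::node()/child::x[1]:
# visit the root and every descendant element, and from each one
# take its first x child, if it has any.
expanded = []
for node in root.iter():            # descendant-or-self (elements only)
    x_children = node.findall("x")  # child::x
    if x_children:
        expanded.append(x_children[0].get("id"))  # [1]

# /descendant::x[1]: the first x among all descendants, in document order.
first_overall = next(root.iter("x")).get("id")

print(expanded)       # ['a1', 'b1']
print(first_overall)  # a1
```

The predicate in the first form is applied once per parent, so it yields one x per parent; in the second form it is applied once to the whole descendant set.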

Michael Kay