
The XPath bookstore/book[1] selects the first book node under bookstore.

How can I select the first node that matches a more complicated condition, e.g. the first node that matches /bookstore/book[@location='US']?

the Tin Man
ripper234

9 Answers


Use:

(/bookstore/book[@location='US'])[1]

This will first get the book elements with the location attribute equal to 'US'. Then it will select the first node from that set. Note the use of parentheses, which are required by some implementations.

Note, this is not the same as /bookstore/book[1][@location='US'] unless the first element also happens to have that location attribute.

the Tin Man
Jonathan Fingland
  • How could I do the same for //bookstore/book[@location='US']? – Alexander V. Ilyin Mar 11 '11 at 19:04
  • This will get all books from 'US'. (/bookstore/book[@location='US'])[1] will get the first one. – Kevin Driedger Apr 17 '12 at 19:39
  • @KevinDriedger `/bookstore/book[@location='US'][1]` does not return all books from 'US'. I have tested it multiple times and under different languages' XPath implementations. `/bookstore/book[@location='US'][1]` returns the first 'US' book under a bookstore. If there are multiple bookstores, then it will return the first from each. This is what the OP asked for (the first node under bookstore). Your version returns only one book from all bookstores (the first match). – Jonathan Fingland Apr 18 '12 at 18:38
  • @JonathanFingland you misunderstood - read KevinDriedger's answer again, along with the context of AlexanderV.Ilyin's question. You both mean the same thing. – kiedysktos Apr 05 '16 at 06:27

/bookstore/book[@location='US'][1] works only with a simple structure.

Add a bit more structure and things break.

With:

<bookstore>
 <category>
  <book location="US">A1</book>
  <book location="FIN">A2</book>
 </category>
 <category>
  <book location="FIN">B1</book>
  <book location="US">B2</book>
 </category>
</bookstore> 

/bookstore/category/book[@location='US'][1] yields

<book location="US">A1</book>
<book location="US">B2</book>

not "the first node that matches a more complicated condition". /bookstore/category/book[@location='US'][2] returns nothing.

With parentheses you can get the result the original question asked for:

(/bookstore/category/book[@location='US'])[1] gives

<book location="US">A1</book>

and (/bookstore/category/book[@location='US'])[2] works as expected.
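
The difference between the two forms can be sketched in Python. This is only an illustration, not a full XPath evaluation: the standard library's xml.etree.ElementTree implements a limited subset of XPath, so the sketch uses it just for the attribute filter and emulates the positional step with ordinary list indexing. The helper names `first_us_per_category` and `nth_us_overall` are made up for this example.

```python
import xml.etree.ElementTree as ET

XML = """
<bookstore>
 <category>
  <book location="US">A1</book>
  <book location="FIN">A2</book>
 </category>
 <category>
  <book location="FIN">B1</book>
  <book location="US">B2</book>
 </category>
</bookstore>
"""

root = ET.fromstring(XML)  # root is the <bookstore> element


def first_us_per_category(bookstore):
    """Emulate /bookstore/category/book[@location='US'][1]:
    the position is counted separately inside each category."""
    firsts = []
    for category in bookstore.findall("category"):
        us_books = category.findall("book[@location='US']")
        if us_books:
            firsts.append(us_books[0].text)
    return firsts


def nth_us_overall(bookstore, n):
    """Emulate (/bookstore/category/book[@location='US'])[n]:
    the position is counted over the whole result, n is 1-based."""
    us_books = bookstore.findall("category/book[@location='US']")
    return us_books[n - 1].text if 0 < n <= len(us_books) else None


per_category = first_us_per_category(root)
first_overall = nth_us_overall(root, 1)
second_overall = nth_us_overall(root, 2)
print(per_category, first_overall, second_overall)
```

Without parentheses the index is applied once per parent, so each category contributes its own first US book; with parentheses the index is applied once to the combined node set.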

John Smith
tkurki
  • Author of the accepted answer here. The OP's question regarded `/bookstore/book[1]` and NOT `(/bookstore/book)[1]`. The case you've provided is not the same as the one the OP asked for. Presumably, the OP accepted my answer as it did what he expected (and requested). – Jonathan Fingland Apr 18 '12 at 18:47
  • This answer helped me for this peculiar case. Can someone explain why it won't handle "more complicated situations"? Since basically it does find a list with two items, the [2] should just pick it up (in my world) – Skurpi May 11 '12 at 09:01
  • I also find this answer to be more correct than the selected answer, as in my case, I also had a more complex structure where simply adding [1] returned multiple nodes. Thanks! – mydoghasworms Sep 18 '12 at 11:52
  • Parentheses work! You can also add more path after (..)[1], like: `'(//div[text() = "'+ name +'"])[1]/following-sibling::*/div/text()'`, in case many nodes match `name`. – Hlung Dec 18 '12 at 10:57
  • This answer is not useful. It is not "a better way" but a different way which depends on more specifics from the OP about what he would want. In this example ()[2] will return the second book in any book store from the US. But without the parens would return the second book in all bookstores from the US, which as Jon pointed out is closer to his original example. And adding category did *nothing* for this example. – Gerard ONeill Aug 27 '13 at 21:09
  • I'm changing my opinion. After some distance, I get what this answer was saying, and if I didn't see the OP's example I woulda voted for this. I suppose I was reacting to the tone of this answer; if @tkurki had explained a little more about separating the condition from the selection of the first node, I woulda instantly seen it. Perhaps the same for JonFingland. – Gerard ONeill Nov 05 '13 at 21:03

As an explanation to Jonathan Fingland's answer:

  • multiple conditions in the same predicate ([position()=1 and @location='US']) must be true as a whole
  • multiple conditions in consecutive predicates ([position()=1][@location='US']) must be true one after another
  • this implies that [position()=1][@location='US'] != [@location='US'][position()=1]
    while [position()=1 and @location='US'] == [@location='US' and position()=1]
  • hint: a lone [position()=1] can be abbreviated to [1]

You can build complex expressions in predicates with the Boolean operators "and" and "or", and with the Boolean XPath functions not(), true() and false(). Plus you can wrap sub-expressions in parentheses.
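
As a rough illustration of why predicate order matters, the three forms from the list can be emulated in Python with plain list operations. This is a sketch under assumptions: the sample data is invented so that the first book is not from the US, and the list slicing stands in for XPath's positional predicates rather than evaluating real XPath.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data: the first book is NOT from the US.
root = ET.fromstring("""
<bookstore>
 <book location="FIN">F1</book>
 <book location="US">U1</book>
 <book location="US">U2</book>
</bookstore>
""")
books = root.findall("book")


def is_us(book):
    return book.get("location") == "US"


# book[position()=1][@location='US']:
# keep the first book, then test whether it is from the US.
position_then_filter = [b.text for b in books[:1] if is_us(b)]

# book[@location='US'][position()=1]:
# keep the US books, then take the first of what is left.
filter_then_position = [b.text for b in books if is_us(b)][:1]

# book[position()=1 and @location='US']:
# both conditions are tested against the original position.
single_predicate = [
    b.text for i, b in enumerate(books, start=1) if i == 1 and is_us(b)
]

print(position_then_filter, filter_then_position, single_predicate)
```

Each consecutive predicate renumbers the nodes that survived the previous one, which is why only the filter-then-position form finds a book here.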

Tomalak
  • Is it possible to have an array of locations (like [1,3,5:7,9]) without using multiple "and" operators? – M.Hossein Rahimi May 08 '21 at 18:34
  • @M.HosseinRahimi In XPath 1.0, no. In XPath 2.0, sequences and the `=` operator do the trick: `[position() = (1,3,5,6,7,9)]`. – Tomalak May 08 '21 at 18:59

The easiest way to find the first US book node (in the whole document), taking into consideration a more complicated structured XML file like:

<bookstore>
 <category>
  <book location="US">A1</book>
  <book location="FIN">A2</book>
 </category>
 <category>
  <book location="FIN">B1</book>
  <book location="US">B2</book>
 </category>
</bookstore> 

is the XPath expression:

/descendant::book[@location='US'][1]

Gee-Bee
  • I don't know why you added 'category' to the (presumptive) xml. I'm down voting this because it answers a question that the OP didn't ask. – samwyse Jan 28 '22 at 18:32
    <bookstore>
     <book location="US">A1</book>
     <category>
      <book location="US">B1</book>
      <book location="FIN">B2</book>
     </category>
     <section>
      <book location="FIN">C1</book>
      <book location="US">C2</book>
     </section>
    </bookstore> 

So, given the above, you can select the first book with

(//book[@location='US'])[1]

This will find the first book anywhere in the document that has location US. [A1]

//book[@location='US']

would return the node set with all books with location US. [A1,B1,C2]

(//category/book[@location='US'])[1]

would return the first book with location US that exists in a category anywhere in the document. [B1]

(/bookstore//book[@location='US'])[1]

will return the first book with location US that exists anywhere under the root element bookstore, which makes the /bookstore part redundant, really. [A1]

In direct answer:

/bookstore/book[@location='US'][1]

will return the first book element with location US that is a direct child of bookstore. [A1]

Incidentally, if you wanted, in this example, to find the first US book that was not a direct child of bookstore:

(/bookstore/*//book[@location='US'])[1]
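
The node sets listed above can be checked with a small Python sketch. It assumes the sample XML from this answer and uses xml.etree.ElementTree, which supports only relative paths and a subset of XPath; the paths are therefore written relative to the bookstore root, and "first" is taken with ordinary list indexing instead of a parenthesized `[1]`. The helper `texts` is a name invented for this example.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring("""
<bookstore>
 <book location="US">A1</book>
 <category>
  <book location="US">B1</book>
  <book location="FIN">B2</book>
 </category>
 <section>
  <book location="FIN">C1</book>
  <book location="US">C2</book>
 </section>
</bookstore>
""")


def texts(path):
    """Return the text of every element matched by an ElementTree path,
    in document order, relative to the <bookstore> root."""
    return [book.text for book in root.findall(path)]


anywhere = texts(".//book[@location='US']")            # like //book[...]
in_category = texts(".//category/book[@location='US']")
direct_children = texts("book[@location='US']")        # like /bookstore/book[...]
nested_only = texts("*//book[@location='US']")         # like /bookstore/*//book[...]

print(anywhere, in_category, direct_children, nested_only)
```

Taking element `[0]` of any of these lists corresponds to wrapping the matching XPath in parentheses and appending `[1]`.
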
iZian
  • I don't know why you added 'category' to the (presumptive) xml. I'm down voting this because it answers a question that the OP didn't ask. – samwyse Jan 28 '22 at 18:33
  • @samwyse because the OP provided no more context around what other information was in their source data. So you answer according to what you think their data might be like, and provide a wider context so that the OP and people finding this question for the same and similar issues can learn more using practical examples. You'll notice I have a book under bookstore. Unlike in your other copy paste response answers. – iZian Jan 31 '22 at 13:29
  • the OP specifies that _`bookstore/book[1]` selects the first book node under bookstore_ which implies that there are no intervening levels. otherwise, i expect they would've used `bookstore//book[1]` – samwyse Mar 10 '23 at 20:14
  • You might expect that. Others might not. I would not expect anyone to presume a schema from an xPath to 1 node. I have seen more times than I have not, someone not realise that they're not even considering all the nodes with an xPath because of this very mistake. Depending on the entire context one could say that one method would be more compatible than another for future schema changes. And one might not want that and could pick the other. Only if they know all the information before they make their decision. – iZian Mar 13 '23 at 12:47

Use an index to get the desired node if the XPath is complicated or more than one node matches the same XPath.

Example:

(//book[@location = 'US'])[index]

Replace index with the 1-based position of the node you want.

frianH

If a namespace is declared in the given XML, it's better to use this:

(/*[local-name() ='bookstore']/*[local-name()='book'][@location='US'])[1]
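
For comparison, here is a minimal Python sketch of the same idea with xml.etree.ElementTree. It does not evaluate `local-name()`; instead it shows the two namespace mechanisms that module offers: an explicit prefix mapping, and the `{*}` wildcard (available from Python 3.8), which matches a tag name in any namespace. The namespace URI and the sample data are assumptions made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical document with a default namespace.
root = ET.fromstring("""
<bookstore xmlns="http://example.com/books">
 <book location="FIN">F1</book>
 <book location="US">U1</book>
 <book location="US">U2</book>
</bookstore>
""")

# A plain, unqualified name does not match namespaced elements.
plain = root.findall("book[@location='US']")

# Option 1: map a prefix of our choosing to the namespace URI.
ns = {"b": "http://example.com/books"}
with_prefix = [b.text for b in root.findall("b:book[@location='US']", ns)]

# Option 2: ignore the namespace with the {*} wildcard (Python 3.8+).
any_namespace = [b.text for b in root.findall("{*}book[@location='US']")]

# "First match" is taken with list indexing rather than (...)[1].
first_us = any_namespace[0] if any_namespace else None
print(plain, with_prefix, any_namespace, first_us)
```
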
Ed Bangga

For example, given:

<input b="demo">

use:

(input[@b='demo'])[1]
Exception_al

I wrote this answer with the help of an online XPath tester.
For this:

<table id="t2"><tbody>
<tr><td>123</td><td>other</td></tr>
<tr><td>foo</td><td>columns</td></tr>
<tr><td>bar</td><td>are</td></tr>
<tr><td>xyz</td><td>ignored</td></tr>
</tbody></table>

the following XPath:

id("t2") / tbody / tr / td[1]

outputs:

123
foo
bar
xyz

This is because td[1] selects every td element that is the first td child of its own parent.
But the following XPath:

(id("t2") / tbody / tr / td)[1]

outputs:

123
Mohsen Abasi