I have HTML like this:

<ol class="list">
   <li class="list-item " id="37647629">
      <!---->
      <div>
         <!---->
         <div>
            <!---->
            <book class="book">
              <div class="title">
                 someText
              </div>    
              <div class="year">
                 2022
              </div>               
            </book>
         </div>
         <!---->         
      </div>
      <!---->
   </li>
   <li class="list-item " id="37647778">
      <!---->
      <div>
         <!---->
         <div>
            <!---->
            <book class="book">
              <div class="title">
                 someOtherText
              </div>    
              <div class="year">
                 2014
              </div>            
            </book>
         </div>
      </div>
      <!---->
   </li>   
</ol>

I want to get the first book's title and year directly, with two XPath expressions. I tried:

$x('//book') => OK, returns a list of the two books

$x('//book[0]') => Empty list    

$x('//book[0]/div[@class="title"]') => Nothing

It seems I have to do this:

$x('//book')[0]

and then process the title, but why can't I do this purely in XPath and access the first title directly with a single XPath expression?

user2178964

2 Answers

This will give you the first book title

"(//book)[1]//div[@class='title']"

And this gives the first book year

"(//book)[1]//div[@class='year']"
Prophet

You're missing that XPath indexing starts at 1; JavaScript indexing starts at 0.

  • $x('//book') selects all book elements in the document.
  • $x('//book[0]') selects nothing because XPath indexing starts at 1. (Note that even $x('//book[1]') selects every book element that is the first book child of its parent, which is not necessarily the same as the first of all book elements in the document.)
    • $x('//book')[0] would select the first book element because JavaScript indexing starts at 0.
    • $x('(//book)[1]') would select the first book element because XPath indexing starts at 1.
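
The difference between a positional predicate applied per parent and one applied to a parenthesized result can be modelled with plain arrays. This is only an illustrative sketch, not an XPath engine: the nested array and the labels `book A` and `book B` are made up to mirror the two `book` elements in the question, each of which sits alone inside its own parent `div`.

```javascript
// Each inner array stands for one parent <div>; each string stands for a <book> child.
const parents = [['book A'], ['book B']];

// Models //book[1]: the predicate is applied per parent, keeping each parent's first book.
const firstAmongSiblings = parents.flatMap(books => books.slice(0, 1));

// Models (//book)[1]: all books are collected in document order, then the first is kept.
const firstInDocument = parents.flat().slice(0, 1);

console.log(firstAmongSiblings); // [ 'book A', 'book B' ]
console.log(firstInDocument);    // [ 'book A' ]
```

With the markup from the question, this is why the parentheses matter: without them, both books would qualify as "first".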

To select the first div with class of 'title', all in XPath:

$x('(//div[@class="title"])[1]')

or, using JavaScript to index:

$x('(//div[@class="title"])')[0]
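
Since `$x` is only available in the browser DevTools console, a page script would need the standard `document.evaluate` API instead. The following is a minimal sketch under that assumption; the helper name `firstBookField` is hypothetical, and the result-type constant is written as a literal so the snippet also loads where `XPathResult` is not defined.

```javascript
// Numeric value of XPathResult.FIRST_ORDERED_NODE_TYPE.
const FIRST_ORDERED_NODE_TYPE = 9;

// Hypothetical helper: returns the trimmed text of the div with the given
// class inside the first <book> of the document, or null if none matches.
function firstBookField(doc, cls) {
  const xpath = `(//book)[1]//div[@class="${cls}"]`;
  const result = doc.evaluate(xpath, doc, null, FIRST_ORDERED_NODE_TYPE, null);
  const node = result.singleNodeValue;
  return node ? node.textContent.trim() : null;
}

// Only runs in an environment that provides a DOM.
if (typeof document !== 'undefined') {
  console.log(firstBookField(document, 'title'));
  console.log(firstBookField(document, 'year'));
}
```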

To return just the string value without the leading/trailing whitespace, wrap in normalize-space():

$x('normalize-space((//div[@class="title"])[1])')

Note that normalize-space() will also consolidate internal whitespace, but that is of no consequence with this example.
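
To make the effect of normalize-space() concrete, here is a rough JavaScript approximation of it. It is a sketch for illustration only, assuming the XPath 1.0 definition of whitespace (space, tab, carriage return, line feed); the function name `normalizeSpace` is made up.

```javascript
// Approximates XPath 1.0 normalize-space(): collapse each run of
// space/tab/CR/LF into one space, then drop a leading or trailing space.
function normalizeSpace(s) {
  return s.replace(/[ \t\r\n]+/g, ' ').replace(/^ | $/g, '');
}

console.log(normalizeSpace('\n                 someText\n              ')); // someText
console.log(normalizeSpace('some   other\n text')); // some other text
```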

kjhughes