I have a table and want to extract data from some data cells.

<table>
    <tr>
        <td class="label"> </td>
        <td class="data"><p><a href="http://en.wikipedia.org/wiki/Liu_Kang"><img src="http://upload.wikimedia.org/wikipedia/en/e/e2/LiuKangshaolinmonks.jpg"/></a></p>
        </td>
    </tr>
    <tr>
        <td class="label">First game</td>
        <td class="data">Mortal Kombat (1992)</td>
    </tr>
        <tr>
        <td class="label">Created by</td>
        <td class="data">John Tobias</td>
    </tr>
        <tr>
        <td class="label">Origin</td>
        <td class="data">Earthrealm</td>
    </tr>
        <tr>
        <td class="label">Weapon</td>
        <td class="data">Nunchaku</td>
    </tr>   
        <tr>
        <td class="label">Colour</td>
        <td class="data">Red</td>
    </tr>
</table>

I would like to extract Nunchaku. This works:

/html/body//tr[5]/td[@class="data"]

However, I would rather not rely on the row position tr[5] and would instead like to select the cell by its label, with something like td[contains(., 'Weapon')], but I am unsure how.

Liu Kang

2 Answers

You need to use the following-sibling:: axis, which selects the sibling nodes that come after the matched label cell:

//td[contains(., 'Weapon')]/following-sibling::td

See this Stack Overflow question or the XPath documentation for more information about following-sibling.
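
To try the expression out, here is a minimal sketch in Python. It assumes the third-party lxml library is installed (the question does not say which XPath engine is in use), and it uses a trimmed two-row copy of the table; the names TABLE, root, cells and values are only illustrative.

```python
from lxml import etree

# Trimmed copy of the table from the question (assumption: two rows suffice).
TABLE = """
<table>
    <tr>
        <td class="label">Weapon</td>
        <td class="data">Nunchaku</td>
    </tr>
    <tr>
        <td class="label">Colour</td>
        <td class="data">Red</td>
    </tr>
</table>
"""

root = etree.fromstring(TABLE)

# Match the td whose text contains 'Weapon', then step to the td
# siblings that follow it in the same row.
cells = root.xpath("//td[contains(., 'Weapon')]/following-sibling::td")
values = [cell.text for cell in cells]
print(values)
```

Because contains() is a substring test, a label such as "Weapons" would match as well, which may or may not be what you want.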

Pawel Miech

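As an alternative, select the row by its label cell and then take the data cell from that row. This compares the label text for equality rather than testing for a substring:

/html/body//tr[td[@class = 'label'] = 'Weapon']/td[@class = 'data']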
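
The same row-by-label idea can be sketched with only the Python standard library. Note the assumptions: xml.etree.ElementTree supports just a subset of XPath (no nested predicates and no axes such as following-sibling), so the expression is simplified to tr[td='Weapon'], which matches a row with any td child whose text is exactly 'Weapon' rather than only one with class "label". The table is a trimmed copy, and the names TABLE, table, cell and weapon are illustrative.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the table from the question (assumption: two rows suffice).
TABLE = """
<table>
    <tr>
        <td class="label">Weapon</td>
        <td class="data">Nunchaku</td>
    </tr>
    <tr>
        <td class="label">Colour</td>
        <td class="data">Red</td>
    </tr>
</table>
"""

table = ET.fromstring(TABLE)

# Pick the row that has a td child with the exact text 'Weapon',
# then take that row's td with class "data".
cell = table.find(".//tr[td='Weapon']/td[@class='data']")
weapon = cell.text if cell is not None else None
print(weapon)
```

With a full XPath 1.0 engine, the original expression with the nested td[@class = 'label'] predicate can be used unchanged.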

Martin Honnen