<td>
some text here
<a href="http://blablabla">ch1</a>
</td>

What is the best way to select the text "some text here"? I want to do this with CSS selectors or XPath, preferably without jQuery. Thank you. (I know this question is very likely a duplicate...)

laike9m
  • What you want is ***the first*** text node, so use `text()[1]` to access it. `text()` works in this case, but I don't think it's safe, because other text nodes may be added in the future. – King King Nov 05 '14 at 14:23

1 Answer

It is simply the text node child of the td, selected with text():

//td/text()

Demo (using xmllint):

$ xmllint index.html --html --xpath '//td/text()'
some text here

Also, following @King King's comment, you may want to select the first text node explicitly by specifying an index (this helps if the td has other text child nodes):

//td/text()[1]

That said, //td/text() works fine on the input you've provided.
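
For readers who want to check the "first text node" idea without xmllint, here is a minimal Python sketch using only the standard library. It relies on two assumptions: the snippet is well-formed XML (true for the input in the question, not for HTML in general), and the wanted text precedes the first child element. ElementTree's XPath subset has no text() step, so the element's `.text` attribute stands in for //td/text()[1]. The names `SNIPPET`, `first_text` and `trailing` are illustrative only.

```python
import xml.etree.ElementTree as ET

# The snippet from the question. It happens to be well-formed XML,
# so the standard-library parser can read it directly.
SNIPPET = """<td>
some text here
<a href="http://blablabla">ch1</a>
</td>"""

td = ET.fromstring(SNIPPET)

# ElementTree has no text() step. An element's .text attribute holds the
# text that precedes its first child element, which for this input is the
# same node that //td/text()[1] selects.
first_text = (td.text or "").strip()

# The text after a child element lives in that child's .tail attribute.
# Here it is whitespace only, so it strips down to an empty string.
trailing = [(child.tail or "").strip() for child in td]

print(first_text)
```

If the td began with a child element instead of text, `.text` would be `None` while text()[1] would select the text after that child, so the two are not interchangeable in general.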

Gilles Quénot
alecxe
  • Your solution is correct, but it seems `text()` can't be used in a Selenium XPath selector. – laike9m Nov 05 '14 at 16:06
  • @laike9m yes, what you can do is follow the advice suggested [here](http://stackoverflow.com/a/12397642/771848) and "subtract" the link text from the `td`'s text. – alecxe Nov 05 '14 at 16:13
  • @laike9m alternatively, you can get the `innerHTML` of the `td` and use an HTML parser (like `BeautifulSoup` or `lxml` in the case of Python) to extract the desired text. – alecxe Nov 05 '14 at 16:14
  • Thank you alecxe, I tried calling `.text` on the `td` webelement and got the text I want. – laike9m Nov 06 '14 at 02:49
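
The `innerHTML` suggestion from the comments can be sketched roughly as follows. This is an illustration, not the code anyone in the thread used: it swaps in Python's standard-library `html.parser` for `BeautifulSoup` or `lxml`, and it assumes the `td`'s inner HTML is already available as a string (with Selenium's Python bindings that would typically come from `get_attribute("innerHTML")` on the element). `LeadingTextParser`, `leading_text` and the `inner_html` literal are hypothetical names and data.

```python
from html.parser import HTMLParser


class LeadingTextParser(HTMLParser):
    """Collect character data that appears before the first tag."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self.seen_tag = False

    def handle_starttag(self, tag, attrs):
        # Once any child tag opens, later text is no longer "leading".
        self.seen_tag = True

    def handle_data(self, data):
        if not self.seen_tag:
            self.chunks.append(data)


def leading_text(inner_html):
    """Return the stripped text that precedes the first child element."""
    parser = LeadingTextParser()
    parser.feed(inner_html)
    parser.close()
    return "".join(parser.chunks).strip()


# Hypothetical innerHTML of the <td> from the question, written as a
# literal so the example runs without a browser.
inner_html = '\nsome text here\n<a href="http://blablabla">ch1</a>\n'
print(leading_text(inner_html))
```

Like `text()[1]`, this keeps only the text before the first child element, so the link text `ch1` is left out.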