Using XPath Selector 'following-sibling::text()' in Selenium (Python)

Question

I'm trying to use Selenium (in Python) to extract some information from a website. I've been selecting elements with XPaths but am having trouble using the following-sibling selector. The HTML is as follows:

<span class="metadata">
    <strong>Photographer's Name: </strong>
    Ansel Adams
</span>

I can select "Photographer's Name" with

In [172]: metaData = driver.find_element_by_class_name('metadata')

In [173]: metaData.find_element_by_xpath('strong').text
Out[173]: u"Photographer's Name:"

I'm trying to select the text that follows the <strong> tag ('Ansel Adams' in the example). I assumed I could use the following-sibling axis, but I receive the following error:

In [174]: metaData.find_element_by_xpath('strong/following-sibling::text()')
ERROR: An unexpected error occurred while tokenizing input
The following traceback may be corrupted or invalid
The error message is: ('EOF in multi-line statement', (328, 0))
... [NOTE: Omitted the traceback for brevity] ...
InvalidSelectiorException: Message: u'The given selector strong/following-sibling::text() is either invalid or does not result in a WebElement. The following error occurred:\n[InvalidSelectorError] The result of the xpath expression "strong/following-sibling::text()" is: [object Text]. It should be an element.'

Any ideas as to why this isn't working?

score 8 · Accepted Answer · answered Jan 19 '12 at 13:50

@RossPatterson is correct. The trouble is that the text 'Ansel Adams' is not a WebElement, so you cannot use find_element or find_elements. If you change your HTML to

<span class="metadata">
    <strong>Photographer's Name: </strong>
    <strong>Ansel Adams</strong>
</span>

then find_element_by_xpath('strong/following-sibling::*[1]').text returns 'Ansel Adams'.

shamp00


Unfortunately, I don't have control over the HTML content. It's strange, though, since the expression works in online XPath testers. This leads me to a second question: is it possible to get all of the contents of `<span class="metadata">` (tags and text)? I can select it with `find_elements_by_class_name('metadata')` but cannot figure out how to get the text with the `<strong>` tags intact. – alukach Jan 20 '12 at 03:19
You could always use `driver.page_source` to get the HTML of the whole page, and then use [something other than webdriver to parse it](http://stackoverflow.com/questions/8692/how-to-use-xpath-in-python). – shamp00 Jan 20 '12 at 10:43
Great, I didn't know about `driver.page_source`, this makes my day, thanks! – alukach Jan 21 '12 at 08:53
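
The `driver.page_source` suggestion above can be sketched with nothing but the standard library. This is a minimal illustration, not the commenters' code: the `fragment` string stands in for the relevant slice of `driver.page_source`, the helper name `text_after_strong` is made up for the example, and it assumes the fragment is well-formed XML. Real pages are often not, in which case a proper HTML parser would be needed instead.

```python
import xml.etree.ElementTree as ET

# Stand-in for the relevant slice of driver.page_source (assumption: the
# fragment is well-formed XML; real pages usually need an HTML parser).
fragment = """<span class="metadata">
    <strong>Photographer's Name: </strong>
    Ansel Adams
</span>"""


def text_after_strong(markup):
    """Return the text node that directly follows the first <strong> child."""
    root = ET.fromstring(markup)
    strong = root.find("strong")
    # ElementTree stores the text that follows an element in its .tail.
    if strong is None or strong.tail is None:
        return None
    return strong.tail.strip()


print(text_after_strong(fragment))
```

ElementTree exposes the text after an element as `.tail`, which plays the role that `following-sibling::text()` plays in XPath.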

score 3 · Answer 2 · answered Apr 13 '14 at 00:50

This is documented in this Selenium bug report: http://code.google.com/p/selenium/issues/detail?id=5459

"Your xpath doesn't return an element; it returns a text node. While this might have been perfectly acceptable in Selenium RC (and by extension, Selenium IDE), the methods on the WebDriver WebElement interface require an element object, not just any DOM node object. WebDriver is working as intended. To fix the issue, you'd need to change the HTML markup to wrap the text node inside an element, like a <span>."

user2707671


Unfortunately, it's hard to find documentation stating that "the methods on the WebDriver WebElement interface require an element object, not just any DOM node object," in contrast to Selenium RC. I finally found something here: http://seleniumhq.github.io/selenium/docs/api/java/org/openqa/selenium/WebElement.html WebElement, the type returned by findElement, "Represents an HTML element". – LarsH Oct 02 '15 at 14:52
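
When the markup cannot be changed, one possible workaround is to let the browser read the text node and hand back a plain string, since `execute_script` can return strings even though `find_element_*` can only return elements. The sketch below is an assumption-laden illustration rather than anything from the answers: the helper name `text_after_strong_js` and the JavaScript snippet are hypothetical, and it assumes the text node sits immediately after the first `<strong>` inside the element.

```python
# JavaScript run in the browser: take the node right after the first <strong>
# inside the given element and return its text (null if there is no such node).
JS_TEXT_AFTER_STRONG = """
var strong = arguments[0].querySelector('strong');
var node = strong ? strong.nextSibling : null;
return node ? node.textContent : null;
"""


def text_after_strong_js(driver, element):
    """Ask the browser for the text that follows <strong> inside element.

    driver is expected to be a Selenium WebDriver and element a WebElement;
    the function only relies on driver.execute_script(script, *args).
    """
    raw = driver.execute_script(JS_TEXT_AFTER_STRONG, element)
    return raw.strip() if raw is not None else None
```

With the question's variables this would be called as `text_after_strong_js(driver, metaData)`.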

score 2 · Answer 3 · answered Jan 19 '12 at 11:49

To get the text "Ansel Adams", just use metaData.get_text(). I don't believe find_element_by_* will allow you to find a text node.

Ross Patterson


Seems like `metaData.get_text()` would give you `Photographer's Name: Ansel Adams`. According to the documentation at http://release.seleniumhq.org/selenium-remote-control/0.9.2/doc/dotnet/Selenium.ISelenium.GetText.html, "This command uses either the textContent (Mozilla-like browsers) or the innerText (IE-like browsers) of the element, which is the rendered text shown to the user." – LarsH Oct 02 '15 at 14:35
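
If the element's full text really is `Photographer's Name: Ansel Adams`, as the comment above suggests, the value can be recovered by stripping the label from the front. The following is a rough sketch under that assumption: `text_after_label` is a hypothetical helper, and the sample strings stand in for the text of the `metadata` element and of its `<strong>` child.

```python
def text_after_label(parent_text, label_text):
    """Return what remains of parent_text once the leading label is removed."""
    parent_text = parent_text.strip()
    label_text = label_text.strip()
    if parent_text.startswith(label_text):
        return parent_text[len(label_text):].strip()
    # Label not at the start: return the text unchanged rather than guess.
    return parent_text


# Sample strings standing in for the element's text and the <strong> text.
print(text_after_label("Photographer's Name: Ansel Adams", "Photographer's Name:"))
```

With a live session and the question's variable names, this might be called as `text_after_label(metaData.text, metaData.find_element_by_xpath('strong').text)`.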
