Selecting a text node in Selenium (Python) with XPath

Question

I want to select the text that comes right after an `hr` node using Selenium and XPath, but I keep getting a WebDriverException.

Here is the html code I want to extract the text from: html snippet

The text I want to get is: Introduction to financial ... business decisions

I used this code:

e = c.find_element_by_xpath("//div[@class='ajaxcourseindentfix']/hr/following-sibling::text()")

The problem is that I keep getting this exception:

selenium.common.exceptions.WebDriverException: Message: TypeError: Expected an element or WindowProxy, got: [object Text] {}

What should I do?

Does `e = c.find_elements_by_css_selector("div.ajaxcourseindentfix").getText()` work? — Naramsim, Feb 09 '18 at 09:56
Update your question with HTML code sample for `div` node as text — Andersson, Feb 09 '18 at 09:57
@Naramsim , there is no built-in `getText()` method in Python applicable to list — Andersson, Feb 09 '18 at 09:59
`e = c.find_elements_by_css_selector("div.ajaxcourseindentfix").text` sorry — Naramsim, Feb 09 '18 at 10:03
@Naramsim, you're still trying to get text from list of elements. Anyway, even applied to single WebElement `text` property should return `"ACCT 200..."` and `"Credit hours"` text nodes which OP wants to skip... — Andersson, Feb 09 '18 at 10:21
try `e = c.find_element_by_xpath("//div[@class='ajaxcourseindentfix']/hr/following-sibling::text()[1]").text` — xruptronics, Feb 09 '18 at 10:32
@xruptronics That doesn't work either, I already tried it. If you want the link to the website, here it is: http://catalog.utk.edu/content.php?filter%5B27%5D=ACCT&filter%5B29%5D=&filter%5Bcourse_type%5D=-1&filter%5Bkeyword%5D=&filter%5B32%5D=1&filter%5Bcpage%5D=1&cur_cat_oid=16&expand=&navoid=1721&search_database=Filter#acalog_template_course_filter — YACINE GACI, Feb 09 '18 at 10:39
Do not use following-sibling; use `(//div[@class='ajaxcourseindentfix']/hr)[1]` — Pradeep hebbar, Feb 09 '18 at 13:00
Does this answer your question? [How to get text of an element in Selenium WebDriver, without including child element text?](https://stackoverflow.com/questions/12325454/how-to-get-text-of-an-element-in-selenium-webdriver-without-including-child-ele) — Pikamander2, Apr 15 '20 at 04:00

Andersson · Answer 1 · 2018-02-09T13:35:43.477

In Selenium you cannot use an XPath that returns attributes or text nodes, so the `/text()` syntax is not allowed. If you want only a specific child text node (or nodes) rather than the complete text content returned by the `text` property, you can execute JavaScript instead.

I adapted the solution from this question and it seems to work, so you can use the code below to get the required text node:

driver.execute_script("""var el = document.createElement( 'html' );
                         el.innerHTML = '<div>' + document.querySelector('div.ajaxcourseindentfix').innerHTML.split('<hr>')[1];
                         return el.querySelector( 'div' ).textContent;""")

The output is

Introduction to financial and managerial accounting theory and practice with emphasis on the role of accounting information in business decisions.
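
The XPath from the question can also be evaluated inside the browser, because `document.evaluate` can return a string where `find_element` can only return elements. The following is a minimal sketch, not a tested solution for this particular page: the helper name `xpath_string` is hypothetical, and `driver` is assumed to be any object exposing Selenium's `execute_script` method.

```python
def xpath_string(driver, xpath):
    """Evaluate `xpath` in the browser and return its string value.

    With XPathResult.STRING_TYPE the browser converts the result to a string;
    for a node-set that is the text of the first matching node in document order.
    """
    script = """
        var result = document.evaluate(
            arguments[0], document, null, XPathResult.STRING_TYPE, null);
        return result.stringValue;
    """
    # The XPath is passed as arguments[0] rather than pasted into the script,
    # so quotes inside the expression need no escaping.
    return driver.execute_script(script, xpath).strip()
```

It could then be called as `xpath_string(driver, "//div[@class='ajaxcourseindentfix']/hr/following-sibling::text()[1]")`. If nothing matches, the string value is empty, so the helper returns `""` rather than raising.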

score 0 · Answer 2 · answered Feb 09 '18 at 13:48

The DOM has several node types, including element, attribute, and text nodes. Selenium's `find_element` requires the locator to return an element node.

In your XPath, `text()` selects a text node, which is why you get that error.

But we can use JavaScript to interact with text nodes.

from selenium.webdriver.common.by import By

script = """
    var text = '';

    // childNodes includes both element nodes and text nodes
    var childNodes = arguments[0].childNodes;

    childNodes.forEach(function(it, index){
      // when we reach an <hr> element, take the text of the node right after it
      if (it.nodeName.toUpperCase() === 'HR' && childNodes[index + 1]) {
        text = childNodes[index + 1].textContent;
      }
    });
    return text;
"""
# find_element (singular) returns one WebElement; the list returned by
# find_elements would arrive in the script as an array with no childNodes
ele = driver.find_element(By.CSS_SELECTOR, "div.ajaxcourseindentfix")
text = driver.execute_script(script, ele)
print(text)
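
The loop above reads whatever node directly follows an `<hr>`, and if the div contains several `<hr>` children the last one wins. A slightly more defensive variant is sketched below, assuming the wanted text is the first non-blank text node after the first `<hr>` child. The names `FIRST_TEXT_AFTER_HR` and `text_after_hr` are hypothetical, and `driver` and `element` are assumed to be a Selenium WebDriver and WebElement.

```python
FIRST_TEXT_AFTER_HR = """
    var nodes = arguments[0].childNodes;
    var seenHr = false;
    for (var i = 0; i < nodes.length; i++) {
        var node = nodes[i];
        if (node.nodeName.toUpperCase() === 'HR') {
            seenHr = true;          // start looking only after the first <hr>
            continue;
        }
        // skip elements and whitespace-only text nodes
        if (seenHr && node.nodeType === Node.TEXT_NODE
                && node.textContent.trim()) {
            return node.textContent.trim();
        }
    }
    return null;                    // no <hr>, or no text after it
"""


def text_after_hr(driver, element):
    """Return the first non-blank text node after the first <hr> child, or None."""
    return driver.execute_script(FIRST_TEXT_AFTER_HR, element)
```

Returning `null` (which Selenium maps to `None`) lets the caller distinguish "nothing found" from an empty string.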
