1

I want to select the text that comes after an `hr` node using Selenium and XPath, but I keep getting a WebDriverException.

Here is the html code I want to extract the text from: html snippet

The text I want to get is: Introduction to financial ... business decisions

I used this code:

e = c.find_element_by_xpath("//div[@class='ajaxcourseindentfix']/hr/following-sibling::text()")

The problem is that I keep getting this exception

selenium.common.exceptions.WebDriverException: Message: TypeError: Expected an element or WindowProxy, got: [object Text] {}

What should I do?

YACINE GACI
  • Does `e = c.find_elements_by_css_selector("div.ajaxcourseindentfix").getText()` works? – Naramsim Feb 09 '18 at 09:56
  • Update your question with HTML code sample for `div` node as text – Andersson Feb 09 '18 at 09:57
  • @Naramsim , there is no built-in `getText()` method in Python applicable to list – Andersson Feb 09 '18 at 09:59
  • `e = c.find_elements_by_css_selector("div.ajaxcourseindentfix").text` sorry – Naramsim Feb 09 '18 at 10:03
  • @Naramsim, you're still trying to get text from list of elements. Anyway, even applied to single WebElement `text` property should return `"ACCT 200..."` and `"Credit hours"` text nodes which OP wants to skip... – Andersson Feb 09 '18 at 10:21
  • try `e = c.find_element_by_xpath("//div[@class='ajaxcourseindentfix']/hr/following-sibling::text()[1]").text` – xruptronics Feb 09 '18 at 10:32
  • @xruptronics That doesn't work either, I already tried it. If you want the link to the website, here it is: http://catalog.utk.edu/content.php?filter%5B27%5D=ACCT&filter%5B29%5D=&filter%5Bcourse_type%5D=-1&filter%5Bkeyword%5D=&filter%5B32%5D=1&filter%5Bcpage%5D=1&cur_cat_oid=16&expand=&navoid=1721&search_database=Filter#acalog_template_course_filter – YACINE GACI Feb 09 '18 at 10:39
  • Do not use following sibling , (//div[@class='ajaxcourseindentfix']/hr)[1] – Pradeep hebbar Feb 09 '18 at 13:00
  • Does this answer your question? [How to get text of an element in Selenium WebDriver, without including child element text?](https://stackoverflow.com/questions/12325454/how-to-get-text-of-an-element-in-selenium-webdriver-without-including-child-ele) – Pikamander2 Apr 15 '20 at 04:00

2 Answers

1

In Selenium, the XPath passed to find_element must return an element, not an attribute or a text node, so a locator ending in text() is not allowed. If you want only a specific child text node rather than the element's complete text content (which the text property returns), you can execute JavaScript instead.

I tried to implement the solution from this question and it seems to work, so you can use the code below to get the required text node:

driver.execute_script("""var el = document.createElement( 'html' );
                         el.innerHTML = '<div>' + document.querySelector('div.ajaxcourseindentfix').innerHTML.split('<hr>')[1];
                         return el.querySelector( 'div' ).textContent;""")

The output is

Introduction to financial and managerial accounting theory and practice with emphasis on the role of accounting information in business decisions.
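
If you would rather keep the extraction on the Python side, the same idea can be sketched with the standard library alone: read the div's markup as a string and collect the text between the first `hr` and the next tag. This is only a sketch under assumptions: the class `TextAfterHr`, the function `text_after_hr`, and the sample markup are hypothetical, and the real page may be structured differently.

```python
from html.parser import HTMLParser


class TextAfterHr(HTMLParser):
    """Collect the text between the first <hr> and the next tag."""

    def __init__(self):
        super().__init__()
        self.state = "before"  # "before" -> "collecting" -> "done"
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if self.state == "before" and tag == "hr":
            self.state = "collecting"
        elif self.state == "collecting":
            self.state = "done"  # the next opening tag ends the text node

    def handle_endtag(self, tag):
        # A self-closing <hr/> also reports an end tag for hr; ignore it.
        if self.state == "collecting" and tag != "hr":
            self.state = "done"

    def handle_data(self, data):
        if self.state == "collecting":
            self.parts.append(data)


def text_after_hr(inner_html):
    """Return the stripped text that directly follows the first <hr>."""
    parser = TextAfterHr()
    parser.feed(inner_html)
    parser.close()
    return "".join(parser.parts).strip()


# Hypothetical markup standing in for the div's innerHTML.
SAMPLE_INNER_HTML = (
    "<h3>ACCT 200 - Foundations of Accounting</h3><hr>"
    "Introduction to financial and managerial accounting."
    "<br><strong>Credit Hours:</strong> 3"
)
result = text_after_hr(SAMPLE_INNER_HTML)
```

With a live driver, the input string would come from something like `driver.find_element_by_css_selector("div.ajaxcourseindentfix").get_attribute("innerHTML")`, which returns the element's markup without needing a text-node XPath.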
Andersson
0

The DOM has several node types, including element, attribute, and text nodes. Selenium's find_element (findElement in Java) requires the locator to return an element node.

In your XPath, text() selects a text node, which is why you get that error.
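
To make the node-type distinction concrete, here is a small illustration that uses Python's `xml.dom.minidom` as a stand-in for the browser DOM. The markup is a hypothetical, simplified version of the div, written as well-formed XML so the standard-library parser accepts it; `kind` is a helper introduced only for this example.

```python
from xml.dom import Node
from xml.dom.minidom import parseString

# Hypothetical, simplified markup (an assumption, not the real page source).
SAMPLE = (
    '<div class="ajaxcourseindentfix">'
    "<h3>ACCT 200</h3><hr/>"
    "Introduction to accounting."
    "<br/><strong>Credit Hours:</strong> 3"
    "</div>"
)


def kind(node):
    """Name the node type for the two cases that matter here."""
    names = {Node.ELEMENT_NODE: "element", Node.TEXT_NODE: "text"}
    return names.get(node.nodeType, "other")


div = parseString(SAMPLE).documentElement
hr = div.getElementsByTagName("hr")[0]
after_hr = hr.nextSibling  # the node right after <hr/>

# List every child of the div with its node type.
for child in div.childNodes:
    print(child.nodeName, kind(child))
```

Here `hr` is an element node, while `after_hr` is a text node whose `data` attribute holds the sentence. An XPath ending in text() points at that second kind of node, which find_element cannot return.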

But we can use JavaScript to work with the text node.

script = """
    var text = '';

    // childNodes includes both element nodes and text nodes
    var childNodes = arguments[0].childNodes;

    childNodes.forEach(function(it, index){
      var next = childNodes[index + 1];
      // when the current node is an <hr>, take the text of the node right after it
      if(it.nodeName.toUpperCase() === 'HR' && next) {
        text = next.textContent;
      }
    });
    return text;
"""
# find_element (singular): the script expects one WebElement, not a list
ele = driver.find_element_by_css_selector("div.ajaxcourseindentfix")
text = driver.execute_script(script, ele)
print(text)
yong