1

I am trying to extract only the number from this HTML element:

<td bgcolor="green">
    <font color="white">
        "49.8 "
        <small>dBmV</small>
    </font>
</td>

How do I extract only the 49.8 without also getting the dBmV?

I can use the XPath to return all of "49.8 dBmV", but when I use an XPath that targets just the "49.8" text I get an error.

Error:

invalid selector: The result of the xpath expression "/html/body/p[1]/table/tbody/tr/td/table[2]/tbody/tr[2]/td[4]/font/text()" is: [object Text]. It should be an element. 

I have tried:

browser.find_element_by_xpath("/html/body/p[1]/table/tbody/tr/td/table[2]/tbody/tr[2]/td[4]/font").text

which returns 49.8 dBmV

And then:

browser.find_element_by_xpath("/html/body/p[1]/table/tbody/tr/td/table[2]/tbody/tr[2]/td[4]/font/text()").text

returns the exception above.

I just want the number 49.8 (which obviously changes). I know I could extract the number afterwards, but I'm hoping there is something a bit tidier that gets the value directly from the HTML.

undetected Selenium
  • 183,867
  • 41
  • 278
  • 352

4 Answers

2

To extract the text 49.8 you can use either of the following Locator Strategies:

  • Using xpath through execute_script() and textContent:

    print(driver.execute_script('return arguments[0].firstChild.textContent;', driver.find_element_by_xpath("//td[@bgcolor='green']/font[@color='white']")).strip())
    
  • Using xpath through splitlines() and get_attribute():

    print(driver.find_element_by_xpath("//td[@bgcolor='green']/font[@color='white']").get_attribute("innerHTML").splitlines()[1].strip())
    
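The first approach can be wrapped in a small helper. This is only a sketch: `first_text_node` is a hypothetical name, and it assumes the number is the first child node of the located element, as in the markup from the question. It uses the generic `find_element("xpath", ...)` call, since recent Selenium 4 releases no longer ship the `find_element_by_*` helpers.

```python
def first_text_node(driver, xpath):
    """Return the stripped text of the first child node of the element at xpath.

    Hypothetical helper: assumes the wanted text is the element's first
    child node, as in the <font> element from the question.
    """
    # "xpath" is the value of Selenium's By.XPATH constant; the literal keeps
    # this sketch free of imports.
    element = driver.find_element("xpath", xpath)
    # firstChild is the text node in front of <small>; textContent is its raw text.
    text = driver.execute_script(
        "return arguments[0].firstChild.textContent;", element
    )
    return text.strip()
```

It would be called as `first_text_node(driver, "//td[@bgcolor='green']/font[@color='white']")`, where `driver` is an already started WebDriver session.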
undetected Selenium
  • 1
    Good one! I did not think of `splitlines()`! – Moshe Slavin Jun 20 '19 at 08:24
  • 1
    That's done it! `print(driver.execute_script('return arguments[0].firstChild.textContent;', driver.find_element_by_xpath("//td[@bgcolor='green']/font[@color='white']")).strip())` is working for me. Thanks for your help! – Glenn Davies Jun 20 '19 at 08:28
1

You can keep your first attempt and take just the number from the text it returns:

text_num = browser.find_element_by_xpath("/html/body/p[1]/table/tbody/tr/td/table[2]/tbody/tr[2]/td[4]/font").text
print(float(text_num.split()[0]))

Hope this helped!
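If the cell text is not always "number, space, unit", a regular expression is a slightly more tolerant variant of the same idea. This is a sketch, not part of the original answer; `first_number` is a hypothetical helper name, and it assumes a plain decimal number with an optional sign.

```python
import re

# Optional sign, digits, optional fractional part (e.g. "49.8", "-3", "+12.25").
_NUMBER = re.compile(r"[-+]?\d+(?:\.\d+)?")


def first_number(text):
    """Return the first decimal number in text as a float, or None if there is none."""
    match = _NUMBER.search(text)
    return float(match.group()) if match else None
```

For example, `first_number(text_num)` can replace `float(text_num.split()[0])` in the snippet above.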

1

You can strip out the extra text like this:

first_text = browser.find_element_by_xpath("/html/body/p[1]/table/tbody/tr/td/table[2]/tbody/tr[2]/td[4]/font").text
second_text = browser.find_element_by_xpath("/html/body/p[1]/table/tbody/tr/td/table[2]/tbody/tr[2]/td[4]/font/small").text
only_first_text = first_text.replace(second_text, '').strip()
undetected Selenium
Moshe Slavin
  • 5,127
  • 5
  • 23
  • 38
  • Hmm, yeah, that would work, but I'm still hoping to extract the number directly, without another line to delete the text. Can I search for the element while ignoring the /small part? – Glenn Davies Jun 20 '19 at 08:20
0

The find_element_by_xpath API in Selenium can only return elements. So even though XPath itself lets you write an expression that returns just the text node you're looking for, Selenium will reject it, and you can't get the text with an XPath locator alone.
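One possible workaround, sketched here under assumptions, is to let the browser evaluate the text-node XPath itself through `execute_script()` and the DOM's `document.evaluate()`, which may return non-element nodes. `xpath_text` and `XPATH_TEXT_JS` are hypothetical names, and the XPath is assumed to select a single text node.

```python
# JavaScript run in the page: evaluate the XPath and return the first
# matching node's text, or null if nothing matches.
XPATH_TEXT_JS = """
const node = document.evaluate(
    arguments[0], document, null,
    XPathResult.FIRST_ORDERED_NODE_TYPE, null
).singleNodeValue;
return node ? node.textContent : null;
"""


def xpath_text(driver, xpath):
    """Return the stripped text of the first node matched by xpath, or None."""
    value = driver.execute_script(XPATH_TEXT_JS, xpath)
    return value.strip() if value is not None else None
```

With this, an expression ending in `/font/text()[1]` can be passed directly; note that XPath positions start at 1, not 0.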

bertilnilsson
  • 304
  • 1
  • 4
  • The comment makes a lot of sense, thanks, but I've tried that line and I'm still getting an error, although now it says it is unable to locate the element: NoSuchElementException: no such element: Unable to locate element: {"method":"xpath","selector":"/html/body/p[1]/table/tbody/tr/td/table[2]/tbody/tr[2]/td[4]/font/text()[0]"} I tried several variations as well, trying to make it work, but I keep getting no such element or invalid selector exceptions – Glenn Davies Jun 20 '19 at 08:00
  • @GlennDavies Sorry, I was looking at the XPath without properly considering the Selenium context. `find_element_by_xpath` only supports returning elements; it won't work with XPaths that return anything else. I will update my answer now. – bertilnilsson Jun 20 '19 at 08:20