Grabbing text using Selenium/XPath/Python

Question

I want to grab the total number of deaths from the Johns Hopkins Covid dashboard using Python, Selenium, and Selenium's ChromeDriver. The number of deaths can be found under the XPath //*[@id="ember1915"]/svg/g[2]/svg/text.

This is my script:

from selenium.webdriver import Chrome
from selenium import webdriver
from selenium.webdriver.common.keys import Keys

with Chrome() as driver:
    driver.get('https://coronavirus.jhu.edu/map.html')
    driver.implicitly_wait(20)  # Wait up to 20 s for the entire page to load.

    displayElement = driver.find_element_by_xpath('//*[@id="ember1915"]/svg/g[2]/svg/text')

It fails with the error:

no such element: Unable to locate element: {"method":"xpath","selector":"//*[@id="ember1915"]/svg/g[2]/svg/text"}

This also happens for other sites I’m trying to scrape.

How can I fix this? What’s the reason for this error?

undetected Selenium · Answer 1 · 2020-09-10T17:39:41.677

The element with the total number of deaths, i.e. 905,181, on the Johns Hopkins Covid dashboard is inside an <iframe>, so you have to:

Use WebDriverWait to wait for the desired frame to be available and switch to it.

Use WebDriverWait with visibility_of_element_located() to wait for the element to become visible. You can use either of the following locator strategies:

Using XPATH and get_attribute():

driver.get('https://coronavirus.jhu.edu/map.html')
WebDriverWait(driver, 20).until(EC.frame_to_be_available_and_switch_to_it((By.XPATH,"//iframe[@title='Coronavirus COVID-19 Global Cases by Johns Hopkins CSSE']")))
print(WebDriverWait(driver, 60).until(EC.visibility_of_element_located((By.XPATH, "//*[name()='svg']/*[name()='text' and text()='Global Deaths']//following::div[1]/*[name()='svg' and @class='responsive-text-group']//*[name()='g' and @class='responsive-text-label']/*[name()='svg']/*[name()='text']"))).get_attribute("innerHTML"))

Using XPATH and the text attribute:

driver.get('https://coronavirus.jhu.edu/map.html')
WebDriverWait(driver, 20).until(EC.frame_to_be_available_and_switch_to_it((By.XPATH,"//iframe[@title='Coronavirus COVID-19 Global Cases by Johns Hopkins CSSE']")))
print(WebDriverWait(driver, 60).until(EC.visibility_of_element_located((By.XPATH, "//*[name()='svg']/*[name()='text' and text()='Global Deaths']//following::div[1]/*[name()='svg']//*[name()='g']/*[name()='svg']/*[name()='text']"))).text)

Console Output:
```
905,181
```

Note: You have to add the following imports:

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
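
Putting the pieces together, the steps above could be combined into one script roughly like the sketch below. It has not been checked against the live page: the two XPaths are copied from this answer and reflect the dashboard markup as of September 2020, so they may need updating. The names FRAME_XPATH, DEATHS_XPATH, parse_count and fetch_global_deaths are hypothetical and chosen only for illustration. Selenium is imported inside the function so that parse_count can be used on its own without a browser setup.

```python
# XPaths copied from the answer; the dashboard markup may have changed since 2020.
FRAME_XPATH = "//iframe[@title='Coronavirus COVID-19 Global Cases by Johns Hopkins CSSE']"
DEATHS_XPATH = (
    "//*[name()='svg']/*[name()='text' and text()='Global Deaths']"
    "//following::div[1]/*[name()='svg']//*[name()='g']"
    "/*[name()='svg']/*[name()='text']"
)


def parse_count(label):
    """Turn a label such as '905,181' into an int (hypothetical helper)."""
    return int(label.strip().replace(",", ""))


def fetch_global_deaths(url="https://coronavirus.jhu.edu/map.html"):
    """Open the dashboard, switch into its iframe and return the death count."""
    # Imported here so parse_count() stays usable without Selenium installed.
    from selenium.webdriver import Chrome
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    with Chrome() as driver:
        driver.get(url)
        # The counter lives inside an <iframe>, so switch into it first.
        WebDriverWait(driver, 20).until(
            EC.frame_to_be_available_and_switch_to_it((By.XPATH, FRAME_XPATH))
        )
        # Wait until the SVG <text> node is visible, then read its text.
        element = WebDriverWait(driver, 60).until(
            EC.visibility_of_element_located((By.XPATH, DEATHS_XPATH))
        )
        return parse_count(element.text)
```

Calling fetch_global_deaths() opens Chrome and, assuming the markup still matches, returns the count as an integer instead of a string with thousands separators.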

You can find a relevant discussion in How to retrieve the text of a WebElement using Selenium - Python

How are you sure that it will return text or inner html without selecting an iframe ? I would like to know even without selecting the shadow element how it will do this. — Dev, Sep 10 '20 at 17:02
@Dev Nice catch, just slipped out while copying the code, corrected it now. — undetected Selenium, Sep 10 '20 at 17:40
