Hello, I am scraping this page: https://www.betexplorer.com/soccer/china/super-league-2016/beijing-guoan-henan-jianye/S49KzkvO/ and I need to scrape the country, the league and season, the home team, and the away team:

Country = driver.find_element_by_xpath("/html/body/div[4]/div[4]/div/div/div[1]/section/ul[1]/li[3]/a").text
leagueseason = driver.find_element_by_xpath("/html/body/div[4]/div[4]/div/div/div[1]/section/header/h1/a").text
Home = driver.find_element_by_xpath("/html/body/div[4]/div[4]/div/div/div[1]/section/ul[2]/li[1]/h2/a").text
Away = driver.find_element_by_xpath("/html/body/div[4]/div[4]/div/div/div[1]/section/ul[2]/li[3]/h2/a").text

I tried these absolute XPaths, but I would like to replace them with more specific XPaths, because the page structure could change. Any suggestions? Thanks.

undetected Selenium
vinicush13

1 Answer

To print the text of these elements, you have to induce WebDriverWait for visibility_of_element_located(), and you can use either of the following locator strategies:

  • Using CSS_SELECTOR and get_attribute("innerHTML"):

    • China:

      print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "ul.list-breadcrumb li:nth-child(3) a"))).get_attribute("innerHTML"))
      
    • Super League 2016:

      print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "h1.wrap-section__header__title>a"))).get_attribute("innerHTML"))
      
    • Beijing Guoan:

      print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "ul.list-details>li:first-child h2.list-details__item__title>a"))).get_attribute("innerHTML"))
      
    • Henan Jianye:

      print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "ul.list-details>li:nth-child(3) h2.list-details__item__title>a"))).get_attribute("innerHTML"))
      
  • Using XPATH and text attribute:

    • China:

      print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//ul[@class='list-breadcrumb']/li[3]//a"))).text)
      
    • Super League 2016:

      print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//h1[@class='wrap-section__header__title']/a"))).text)
      
    • Beijing Guoan:

      print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//ul[@class='list-details']/li[1]//h2/a"))).text)
      
    • Henan Jianye:

      print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//ul[@class='list-details']/li[3]//h2/a"))).text)
      
    • Note: You have to add the following imports:

      from selenium.webdriver.support.ui import WebDriverWait
      from selenium.webdriver.common.by import By
      from selenium.webdriver.support import expected_conditions as EC
      

You can find a relevant discussion in How to retrieve the text of a WebElement using Selenium - Python
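
To illustrate why class-based relative XPaths tend to survive layout changes better than absolute ones, here is a minimal sketch that uses only Python's standard-library xml.etree.ElementTree instead of Selenium. The markup in SNIPPET is a hypothetical, heavily simplified stand-in for the page (only the class names are taken from the locators above, and the placeholder breadcrumb and score entries are invented). The names extract, PATHS, and ABSOLUTE_HOME are illustrative as well. ElementTree supports only a subset of XPath, so the expressions start with "." and avoid axes such as following::.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup; the real page is larger and may differ.
SNIPPET = """
<body>
  <div>
    <section>
      <ul class="list-breadcrumb">
        <li><a>placeholder-1</a></li>
        <li><a>placeholder-2</a></li>
        <li><a>China</a></li>
      </ul>
      <header>
        <h1 class="wrap-section__header__title"><a>Super League 2016</a></h1>
      </header>
      <ul class="list-details">
        <li><h2 class="list-details__item__title"><a>Beijing Guoan</a></h2></li>
        <li><p>score-placeholder</p></li>
        <li><h2 class="list-details__item__title"><a>Henan Jianye</a></h2></li>
      </ul>
    </section>
  </div>
</body>
""".strip()

# Relative XPaths anchored on class names rather than on the full path from <body>.
PATHS = {
    "country": ".//ul[@class='list-breadcrumb']/li[3]/a",
    "league_season": ".//h1[@class='wrap-section__header__title']/a",
    "home": ".//ul[@class='list-details']/li[1]/h2/a",
    "away": ".//ul[@class='list-details']/li[3]/h2/a",
}

# Absolute-style path, in the spirit of the XPaths from the question.
ABSOLUTE_HOME = "./div/section/ul[2]/li[1]/h2/a"


def extract(root):
    """Return the text of each field found via the class-based paths."""
    return {name: root.find(path).text for name, path in PATHS.items()}


original = ET.fromstring(SNIPPET)

# Simulate a layout change: wrap everything in one extra <div>.
changed = ET.fromstring(
    SNIPPET.replace("<body>", "<body><div>").replace("</body>", "</div></body>")
)

print(extract(original))
print(extract(changed))                    # same values after the layout change
print(changed.find(ABSOLUTE_HOME))         # None: the absolute path no longer matches
```

In this sketch the class-based paths return the same four values before and after the extra wrapper is added, while the absolute path stops matching. The same reasoning applies to the Selenium locators above, with the caveat that an exact @class='...' comparison only matches when the element carries exactly that class attribute value.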


undetected Selenium