I'm learning to use Selenium for web scraping, and I have a couple of questions about the website I'm working with:
- The website spreads its results over multiple pages, and I can't find a way to locate the links to those pages and iterate over them. For example, in the following code link_page comes back as NoneType.
from selenium import webdriver
import time
driver = webdriver.Chrome('chromedriver')
driver.get('https://www.oddsportal.com/soccer/england/premier-league')
time.sleep(0.5)
results_button = driver.find_element_by_xpath('/html/body/div[1]/div/div[2]/div[6]/div[1]/div/div[1]/div[2]/div[1]/div[2]/ul/li[3]/span')
results_button.click()
time.sleep(3)
season_button = driver.find_element_by_xpath('/html/body/div[1]/div/div[2]/div[6]/div[1]/div/div[1]/div[2]/div[1]/div[3]/ul/li[2]/span/strong/a')
season_button.click()
link_page = driver.find_element_by_xpath('/html/body/div[1]/div/div[2]/div[6]/div[1]/div/div[1]/div[2]/div[1]/div[6]/div/a[3]/span').get_attribute('href')
print(link_page.text)
driver.get(link_page)
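One thing that may be relevant here: the XPath above ends in /a[3]/span, so the lookup targets the span inside the link rather than the link itself. Selenium's get_attribute returns None when the element does not have the requested attribute. The sketch below reproduces that situation with the standard library only. The markup in it is hypothetical and assumed for illustration; the real page's HTML may differ.

```python
import xml.etree.ElementTree as ET

# Hypothetical pagination markup, assumed for illustration only;
# the actual structure on the site may be different.
snippet = '<div><a href="/results/#/page/2"><span>2</span></a></div>'
root = ET.fromstring(snippet)

anchor = root.find("a")       # the link element
span = anchor.find("span")    # the element an XPath ending in /a/span selects

span_href = span.get("href")      # the span carries no href attribute
anchor_href = anchor.get("href")  # the enclosing <a> does

print(span_href)
print(anchor_href)
```

If the same holds on the real page, pointing the XPath at the a element (dropping the trailing /span) would be the thing to try.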
- For some reason I have to click results_button first to be able to get the href of the matches. For example, the following code tries to go to the page directly (as an attempt to circumvent problem 1 above), but the link_page lookup raises a NoSuchElementException.
from selenium import webdriver
import time
driver = webdriver.Chrome('chromedriver')
driver.get('https://www.oddsportal.com/soccer/england/premier-league/results/#/page/2')
time.sleep(3)
link_page = driver.find_element_by_xpath('/html/body/div[1]/div/div[2]/div[6]/div[1]/div/div[1]/div[2]/div[1]/div[6]/table/tbody/tr[11]/td[2]/a').get_attribute('href')
print(link_page.text)
driver.get(link_page)
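A note on the URL used above: everything after the # is a URL fragment, which the browser does not send to the server. That suggests, though I have not confirmed it, that the rows for page 2 are filled in by JavaScript after the initial load, so a fixed sleep and an absolute XPath may run before the table exists. The sketch below only splits the URL to show which part is the fragment. The helper results_page_url is a hypothetical name, and it assumes every page follows the #/page/N pattern seen in the URL above.

```python
from urllib.parse import urlsplit

BASE = "https://www.oddsportal.com/soccer/england/premier-league/results/"


def results_page_url(page):
    """Hypothetical helper: build a results URL, assuming the '#/page/N' pattern."""
    return f"{BASE}#/page/{page}"


url = results_page_url(2)
parts = urlsplit(url)

# The path is what the server receives; the fragment stays in the browser.
print(parts.path)
print(parts.fragment)
```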