There is a page with tables of statistics I'm trying to pull.
The page has the default year as 2020, with a dropdown box to select different years. I wrote this code to select 2009.
from selenium import webdriver as wd
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from pandas.io.html import read_html
from selenium.webdriver.support.ui import Select
import numpy as np
import re
import pandas as pd
driver=wd.Chrome()
driver.implicitly_wait(10)
driver.get('https://www.cpbl.com.tw/standings/history')
select = Select(driver.find_element_by_id('Year'))
# select by visible text
select.select_by_visible_text('2009')
button = driver.find_elements_by_xpath("//input[@value='查詢']")[0]
button.click()
main=driver.find_elements_by_xpath('//*[(@id = "PageListContainer")]')[0]
main_att=main.get_attribute('innerHTML')
tab=pd.read_html(main_att)
I purposely didn't say driver.close() to leave the browser open, so I can look at it, and apparently the selection of 2009 worked. The browser had tables for 2009. However, the data my code pulled was still from the default year (2020). I want data from 2009. Anyone know why?
I am using Python 3.7 and Spyder 4.0.1