The page is quite dynamic (and terribly slow, at least on my side): it relies on JavaScript and fires multiple asynchronous requests to load the data. Approaching that with requests would not be easy, so you might need to fall back on browser automation via, for example, selenium.
Here is something for you to get started. Note the use of Explicit Waits here and there:
import time
from io import StringIO

import pandas as pd
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.select import Select
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Firefox()
driver.maximize_window()
driver.get("http://quotes.freerealtime.com/dl/frt/M?IM=quotes&type=Time%26Sales&SA=quotes&symbol=IBM&qm_page=45750")

wait = WebDriverWait(driver, 400)  # 400 seconds timeout

# wait for select element to be visible
time_select = Select(wait.until(EC.visibility_of_element_located((By.CSS_SELECTOR, "select[name=time]"))))

# select 9:30 and go
time_select.select_by_visible_text("09:30")
# click through JavaScript; find_element(By.ID, ...) replaces find_element_by_id, which Selenium 4 removed
driver.execute_script("arguments[0].click();", driver.find_element(By.ID, "go"))
time.sleep(2)

while True:
    # wait for the table to appear and load it into a pandas dataframe
    table = wait.until(EC.presence_of_element_located((By.ID, "qmmt-time-and-sales-data-table")))
    # wrap the markup in StringIO: recent pandas versions deprecate passing literal HTML strings
    df = pd.read_html(StringIO(table.get_attribute("outerHTML")))
    print(df[0])

    # wait for the offset select to be present and move it 1 minute forward
    offset_select = Select(wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, "select[name=timeOffset]"))))
    offset_select.select_by_value("1")
    time.sleep(2)

    # TODO: think of a break condition
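In case the Explicit Waits are unfamiliar: conceptually, an explicit wait polls a condition until it returns something truthy or a timeout expires, instead of sleeping for a fixed time. Below is a rough, Selenium-free sketch of that idea. `wait_until` and `table_ready` are hypothetical names made up for illustration; this is not how Selenium implements `WebDriverWait`, just the general pattern behind it.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Poll `condition` until it returns a truthy value or `timeout` seconds pass.

    Hypothetical helper for illustration only; with Selenium you would call
    WebDriverWait(driver, timeout).until(condition) instead.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result  # hand back whatever the condition produced
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)


# Usage: a fake condition that becomes truthy on the third poll.
attempts = []


def table_ready():
    attempts.append(1)
    return "table" if len(attempts) >= 3 else None


found = wait_until(table_ready, timeout=5, poll_interval=0.01)
print(found, len(attempts))
```

The point is that the wait returns as soon as the condition holds, so a slow page costs only as much time as it actually needs, while a fixed `time.sleep` always costs the full delay.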
Note that this runs really, really slowly on my machine and I am not sure how well it would run on yours. It keeps advancing 1 minute forward in an endless loop, so you would need to stop it at some point.
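One possible way to stop the loop, as a sketch only: quit when the table no longer changes after advancing, and keep a hard cap on the number of iterations as a safety net. I have not verified that this page actually stops changing once the data runs out, so treat that as an assumption. `collect_pages`, `fetch_page`, `advance` and `max_pages` are hypothetical names; in the real script `fetch_page` would return `table.get_attribute("outerHTML")` and `advance` would forward the offset select.

```python
def collect_pages(fetch_page, advance, max_pages=500):
    """Collect pages until one repeats the previous page or max_pages is reached.

    fetch_page: callable returning the current table markup (any comparable value)
    advance:    callable that moves the page forward by one step
    """
    pages = []
    previous = None
    for _ in range(max_pages):
        current = fetch_page()
        if current == previous:
            break  # nothing new was loaded; assume the data ran out
        pages.append(current)
        previous = current
        advance()
    return pages


# Usage with fake data standing in for the browser: the third page repeats the second.
fake_pages = iter([
    "<table>09:30</table>",
    "<table>09:31</table>",
    "<table>09:31</table>",
])
pages = collect_pages(lambda: next(fake_pages), lambda: None)
print(len(pages))
```

Comparing raw markup is crude; if the page embeds timestamps or other noise, comparing the parsed dataframes (or the last trade time in them) would be a more reliable signal.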