I am trying to check the dates/times availability for an exam using Python mechanize and send someone an email if a particular date/time becomes available in the result (result page screenshot attached)
import mechanize
from BeautifulSoup import BeautifulSoup
URL = "http://secure.dre.ca.gov/PublicASP/CurrentExams.asp"
br = mechanize.Browser()
response = br.open(URL)
# there are some errors in doctype and hence filtering the page content a bit
response.set_data(response.get_data()[200:])
br.set_response(response)
br.select_form(name="entry_form")
# select Oakland for the 1st set of checkboxes
for i in range(0, len(br.find_control(type="checkbox",name="cb_examSites").items)):
if i ==2:
br.find_control(type="checkbox",name="cb_examSites").items[i].selected =True
# select salesperson for the 2nd set of checkboxes
for i in range(0, len(br.find_control(type="checkbox",name="cb_examTypes").items)):
if i ==1:
br.find_control(type="checkbox",name="cb_examTypes").items[i].selected =True
reponse = br.submit()
print reponse.read()
I am able to get the response but for some reason the data within my table is missing
here are the buttons from the initial html page
<input type="submit" value="Get Exam List" name="B1">
<input type="button" value="Clear" name="B2" onclick="clear_entries()">
<input type="hidden" name="action" value="GO">
one part of the output (submit response) where the actual data is lying
<table summary="California Exams Scheduling" class="General_list" width="100%" cellspacing="0"> <EVERTHING INBETWEEN IS MISSING HERE>
</table>
All the data within the table is missing. I have provided a screenshot of the table element from chrome browser.
- Can someone please tell me what could be wrong ?
Can someone please tell me how to get the date/time out of the response (assuming I have to use BeautifulSoup) and so has to be something on these lines. I am trying to find out if a particular date I have in mind (say March 8th) in the response shows up a Begin Time of 1:30 pm..screenshot attached
soup = BeautifulSoup(response.read()) print soup.find(name="table")
update - looks like my issue might be related to this question and am trying my options . I tried the below as per one of the answers but cannot see any tr elements in the data (though can see this in the page source when I check it manually)
soup.findAll('table')[0].findAll('tr')
Update - Modfied this to use selenium, will try and do further at some point soon
from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver.common.keys import Keys
from bs4 import BeautifulSoup
import urllib3
myURL = "http://secure.dre.ca.gov/PublicASP/CurrentExams.asp"
browser = webdriver.Firefox() # Get local session of firefox
browser.get(myURL) # Load page
element = browser.find_element_by_id("Checkbox5")
element.click()
element = browser.find_element_by_id("Checkbox13")
element.click()
element = browser.find_element_by_name("B1")
element.click()