I have this 2-by-2 HTML table (a header plus one row to start with):
<table id="table">
<thead><tr><th> 1 </th><th> 2 </th></tr></thead>
<tbody><tr><td> body1 </td><td> body2 </td></tr></tbody>
</table>
I attached JavaScript functions to buttons on the page that add and delete rows when the user chooses to. My question is: how can I store the contents of this variable-size table in Python variables? I am trying to scrape/parse the page, but so far I haven't been able to extract the table data. I've tried two different approaches:
from urllib.request import urlopen

url = "http://myprojecturl.com" # assume this is correct
page = urlopen(url)
html_bytes = page.read()
html = html_bytes.decode("utf-8")
table = html.find("table") # index of the first occurrence of "table", or -1 if absent
print(table)
This prints -1.
from lxml import etree

html_file = open("http://myprojecturl.com")
html_content = html_file.read()
parsed_html = etree.HTML(html_content)
print(html_content)
html_tables = parsed_html.findall("table/tbody") # I've tried adding "<>" to this parameter and using other tags
This returns an empty list.
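One possible factor in the empty result can be sketched as follows. A path such as "table/tbody" is matched relative to the root element, so it only finds a table that is a direct child of the root. Because etree.HTML returns the <html> element as the root, the table sits one level deeper, under <body>. The snippet uses the standard library's xml.etree.ElementTree instead of lxml (an assumption, made only to avoid a third-party dependency), and the sample markup is a hypothetical stand-in for the real page.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed stand-in for the page (not the real project markup).
SAMPLE = (
    "<html><body><table id='table'>"
    "<thead><tr><th> 1 </th><th> 2 </th></tr></thead>"
    "<tbody><tr><td> body1 </td><td> body2 </td></tr></tbody>"
    "</table></body></html>"
)

root = ET.fromstring(SAMPLE)  # root is the <html> element

# Relative path: looks for <table> only among the direct children of <html>.
direct = root.findall("table/tbody")

# ".//" searches all descendants, so the table nested under <body> is found.
nested = root.findall(".//table/tbody")

print(len(direct), len(nested))
```

If the same holds for the lxml tree, a path starting with ".//" may be worth trying before changing anything else.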
I believe the path is correct, so perhaps the objects being returned are of a different type than what I am printing. Any suggestions for a fix or a different approach are much appreciated.
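A minimal sketch of the kind of result being asked for, assuming the table markup is already available as a string: each row becomes a Python list of cell strings. It uses only the standard library's html.parser. The names TableRows, parse_table and SAMPLE_TABLE are hypothetical, introduced just for this illustration.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of each <th>/<td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text pieces of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def parse_table(html_text):
    """Return the table as a list of rows (hypothetical helper)."""
    parser = TableRows()
    parser.feed(html_text)
    parser.close()
    return parser.rows


# Hypothetical sample: the table from the question with well-formed tags.
SAMPLE_TABLE = """<table id="table">
<thead><tr><th> 1 </th><th> 2 </th></tr></thead>
<tbody><tr><td> body1 </td><td> body2 </td></tr></tbody>
</table>"""

rows = parse_table(SAMPLE_TABLE)
header, body = rows[0], rows[1:]
print(header)  # ['1', '2']
print(body)    # [['body1', 'body2']]
```

Note that rows added in the browser by JavaScript exist only in that browser's DOM; a fresh request made with urlopen receives the page as the server sends it.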