
I have this 2-by-2 HTML table (a header row plus one body row to start with).

<table id="table">
<thead><tr><th> 1 </th><th> 2 </th></tr></thead>
<tbody><tr><td> body1 </td><td> body2 </td></tr></tbody>
</table>

I added JavaScript functions to buttons on the page that add and delete rows if the user chooses to. My question is: how can I store the contents of this variable-size table in Python variables? I am trying to scrape/parse it, but I haven't been able to collect the information in order to store it. I've tried two different approaches:

from urllib.request import urlopen

url = "http://myprojecturl.com"  # assume this is correct
page = urlopen(url)
html_bytes = page.read()
html = html_bytes.decode("utf-8")
table = html.find("table")  # str.find gives a character index, or -1 if absent
print(table)

This prints -1.
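
For reference, `str.find` does a plain substring search and returns an integer position, never a table element; `-1` means the text "table" does not occur anywhere in the downloaded string. A small sketch of that behavior follows. The two sample strings are made up for illustration and are not the real page.

```python
# Hypothetical sample strings, not the actual page content.
sample_with_table = '<table id="table"><tbody></tbody></table>'
sample_without_table = "<html><body><p>No markup of interest</p></body></html>"

# str.find returns the index of the first match, or -1 when there is none.
index_found = sample_with_table.find("table")
index_missing = sample_without_table.find("table")

print(index_found)
print(index_missing)
```

So a result of -1 suggests that the HTML actually received does not contain the word "table" at all, which is worth checking by printing `html` itself.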

from urllib.request import urlopen
from lxml import etree

html_content = urlopen("http://myprojecturl.com").read()  # open() cannot read URLs
parsed_html = etree.HTML(html_content)
print(html_content)
html_tables = parsed_html.findall("table/tbody")  # I've tried adding "<>" to this parameter and using other tags
print(html_tables)

The result is an empty list.
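
One possible cause of the empty list is the path itself: a path such as "table/tbody" is resolved relative to the element `findall` is called on, so it only matches a `<table>` that is a direct child of the root `<html>` element. The sketch below illustrates this with the standard library's `xml.etree.ElementTree` as a stand-in, on the assumption that lxml's `findall` resolves these paths the same way; the `sample` markup is a hypothetical, well-formed version of the page.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed stand-in for the page.
sample = (
    "<html><body>"
    '<table id="table">'
    "<thead><tr><th>1</th><th>2</th></tr></thead>"
    "<tbody><tr><td>body1</td><td>body2</td></tr></tbody>"
    "</table>"
    "</body></html>"
)

root = ET.fromstring(sample)  # root is the <html> element

# Relative path: only looks for <table> among the direct children of <html>.
direct = root.findall("table/tbody")

# ".//" searches every descendant of the root instead.
nested = root.findall(".//table/tbody")

# Collect the text of each <td> inside the matched <tbody>.
cells = [td.text for td in nested[0].iter("td")]
print(len(direct), len(nested), cells)
```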

I have reason to believe the path is correct, so perhaps the objects being returned are of a different type than what I expect when I print them. Any suggestions for a correction or a different approach are much appreciated.
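
As a point of comparison, here is a minimal sketch of reading such a table into Python lists using only the standard library's `html.parser`. The `TableParser` class and the `sample` string are hypothetical names introduced for illustration, and the sketch assumes the table markup is present in the HTML string being parsed. Rows added in the browser by JavaScript exist only in the browser's copy of the page, so they would not appear in HTML fetched again from the server unless the page sends them back.

```python
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the text of every <th>/<td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


# Hypothetical sample mirroring the table in the question.
sample = """
<table id="table">
<thead><tr><th> 1 </th><th> 2 </th></tr></thead>
<tbody><tr><td> body1 </td><td> body2 </td></tr></tbody>
</table>
"""

parser = TableParser()
parser.feed(sample)
parser.close()

# First row is the header; the rest are body rows, however many there are.
header, *body = parser.rows
print(header)
print(body)
```

Because `body` is a list of rows, the same code handles any number of rows without changes.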

  • You could use `BeautifulSoup` https://stackoverflow.com/a/23377804/7942856 – PacketLoss Apr 13 '21 at 05:32
  • why are you trying to do this anyway? why not send the data to your python side via ajax in button event handler ? – aSaffary Apr 13 '21 at 05:34
  • @aSaffary This is for the final project of a course I'm taking. I'm not familiar with JavaScript and I am somewhat new to coding. Thanks for the suggestion, it looks like this will be the ideal approach – Augusto Barbosa Arraes Apr 14 '21 at 01:10
