I'd like to systematically scrape the privacy breach data found here which is directly embedded in the HTML of the page. I've found various links on StackOverflow about missing HTML and not being able to scrape a table using BS4. Both of these threads seem very similar to the issue that I'm having, however i'm having a difficult time reconciling the differences.
Here's my problem: When I pull the HTML using either Requests or urllib (python 3.6) the second table does not appear in the soup. The second link above details that this can occur if the table/data is added in after the page loads using javascript. However when I examine the page source the data is all there, so that doesn't seem to be the issue. A snippet of my code is below.
url = 'https://www.privacyrights.org/data-breach/new?title=&page=1'
r = requests.get(url, verify=False)
soupy = BeautifulSoup(r.content, 'html5lib')
print(len(soupy.find_all('table')))
# only finds 1 table, there should be 2
This code snippet fails to find the table with the actual data in it. I've tried lmxl, html5lib, and html.parse parsers. I've tried urllib and Requests packages to pull down the page.
Why can't requests + BS4 find the table that I'm looking for?