I want to get data from the table on this website: https://www.skyscrapercenter.com/quick-lists#q=&page=1&type=building&status=COM&status=UCT&min_year=0&max_year=9999&region=0&country=0&city=0 . When I read the HTML content of the table, the body comes back empty:
<thead>
<tr>
<th width="4%"> <div class="flex">#</div> </th>
<th width="15"> </th>
<th> <div class="flex">Building Name</div> </th>
<th width="15%"> <div class="flex">City</div> </th>
<th width="8%"> <div class="flex">Height m</div> </th>
<th width="8%"> <div class="flex">Floors</div> </th>
<th width="8%"> <div class="flex">Completion</div> </th>
<th width="10%"> <div class="flex">Material</div> </th>
<th width="15%"> <div class="flex">Use</div> </th>
</tr>
</thead>
<tbody>
</tbody>
</table>
Inspect Element in the browser shows that there are rows inside the tbody, but with my code I can only get information from the thead. find_all('tr') only returns the header row, and find_all('td') returns nothing. This is my code:
import requests
from bs4 import BeautifulSoup
import pandas as pd

url = 'https://www.skyscrapercenter.com/quick-lists#q=&page=1&type=building&status=COM&status=UCT&min_year=0&max_year=9999&region=0&country=0&city=0'
page = requests.get(url)
soup = BeautifulSoup(page.text, 'lxml')
table1 = soup.find('table', id='table-buildings')

# Collect the column titles from the header cells
headers = []
for i in table1.find_all('th'):
    title = i.text
    headers.append(title)

mydata = pd.DataFrame(columns=headers)

# Create a for loop to fill mydata
for j in table1.find_all('tr'):
    row_data = j.find_all('td')
    row = [i.text for i in row_data]
    if not row:  # the header row has no <td> cells, so skip it
        continue
    length = len(mydata)
    mydata.loc[length] = row  # add the row at the next index

mydata
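To reproduce the symptom without a network request, here is a minimal sketch that counts the tags in the HTML pasted above. It assumes that markup is exactly what requests receives. TagCounter and count_tags are hypothetical helper names, and the standard library's html.parser stands in for BeautifulSoup only to keep the sketch dependency-free.

```python
from collections import Counter
from html.parser import HTMLParser

# Table markup copied from the question (assumed to be what requests receives).
RAW_HTML = """
<thead>
<tr>
<th width="4%"> <div class="flex">#</div> </th>
<th width="15"> </th>
<th> <div class="flex">Building Name</div> </th>
<th width="15%"> <div class="flex">City</div> </th>
<th width="8%"> <div class="flex">Height m</div> </th>
<th width="8%"> <div class="flex">Floors</div> </th>
<th width="8%"> <div class="flex">Completion</div> </th>
<th width="10%"> <div class="flex">Material</div> </th>
<th width="15%"> <div class="flex">Use</div> </th>
</tr>
</thead>
<tbody>
</tbody>
</table>
"""


class TagCounter(HTMLParser):
    """Hypothetical helper: counts how often each opening tag appears."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        self.counts[tag] += 1


def count_tags(html):
    """Return a Counter mapping tag name to number of opening tags."""
    parser = TagCounter()
    parser.feed(html)
    parser.close()
    return parser.counts


counts = count_tags(RAW_HTML)
# Header cells, rows, data cells
print(counts["th"], counts["tr"], counts["td"])  # 9 1 0
```

The counts match what find_all reports: nine th cells, one tr, and no td at all, so the rows are not present in this markup to begin with.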
I found a similar post, but the link it uses is broken, so I can't check it, and honestly I don't quite know how to adapt the answer to my own situation, as I'm pretty new to scraping.
Another question: how can I access the rows on the following pages? I would like to scrape all 500 results, not just the first 50. Thanks in advance!
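One detail that may be relevant to the paging question: everything after the # in the URL is a fragment, and fragments are not sent to the server as part of an HTTP request. Below is a minimal sketch with the standard library (no network access) showing how the URL above splits. The variable names are illustrative only.

```python
from urllib.parse import parse_qs, urlsplit

url = (
    "https://www.skyscrapercenter.com/quick-lists"
    "#q=&page=1&type=building&status=COM&status=UCT"
    "&min_year=0&max_year=9999&region=0&country=0&city=0"
)

parts = urlsplit(url)

# The path is what the server is asked for; the query string is empty
# because all the filter parameters sit after the '#'.
print("path:    ", parts.path)
print("query:   ", repr(parts.query))
print("fragment:", parts.fragment)

# The fragment happens to be formatted like a query string, so it can be
# parsed the same way to inspect the filters (blank values are dropped).
params = parse_qs(parts.fragment)
print("page:    ", params["page"])
print("status:  ", params["status"])
```

So changing page=1 to page=2 in this URL should not change what requests.get downloads. How the site actually loads the rows for each page is not something this sketch establishes.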