I have a page that links to several products. Each product page contains a table of specifications. In each table, the first column holds the field names and the second column holds the corresponding values. The field names differ from product to product, with some overlap. I want to build one big table whose columns are all of these fields and whose rows are the individual products. I am able to get the data for one table (one product) as follows:
import requests
import pandas as pd
import xlsxwriter
import csv
from lxml import html
from bs4 import BeautifulSoup
url= "https://www.1800cpap.com/resmed-airfit-n30-nasal-cpap-mask-with-headgear"
source_code= requests.get(url)
plain_text= source_code.text
soup= BeautifulSoup(plain_text, 'html.parser')
table= soup.find("table", {"class":"table"})
print(table)
output_rows=[]
table_rows= table.find_all('tr')
#print(table_rows)
headers = [td.text for td in soup.select_one('.table').select('td:nth-of-type(1)')]
with open("data.csv", "w", encoding="utf-8-sig", newline='') as csv_file:
    w = csv.writer(csv_file, delimiter=",", quoting=csv.QUOTE_MINIMAL)
    # Header row: text of the first column of the spec table
    w.writerow(headers)
    # One data row per table: text of the second column
    for table in soup.select('table'):
        w.writerow([td.text for td in table.select('td:nth-of-type(2)')])
I understand that for different products I will have to loop over the link to each product, and I am able to do that. However, how do I append each product's table to the previous output so that the required table structure is maintained?
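One possible way to frame the target structure, as a rough sketch rather than a definitive answer: turn each product's table into a dictionary keyed by the first-column text, collect those dictionaries in a list, and write them with `csv.DictWriter` using the union of all keys as the header row. The helper names `table_to_dict` and `write_combined` are hypothetical, the sample data is made up, and `table` is assumed to be a BeautifulSoup tag like the one returned by `soup.find("table", {"class": "table"})` above.

```python
import csv
import io


def table_to_dict(table):
    """Map first-column text to second-column text for one spec table.

    Assumption: `table` is a BeautifulSoup Tag whose rows each hold
    a field name in the first <td> and its value in the second <td>.
    """
    spec = {}
    for row in table.find_all("tr"):
        cells = row.find_all("td")
        if len(cells) >= 2:
            spec[cells[0].get_text(strip=True)] = cells[1].get_text(strip=True)
    return spec


def write_combined(specs, csv_file):
    """Write one row per product; columns are the union of all field names."""
    headers = []
    for spec in specs:
        for key in spec:
            if key not in headers:  # keep first-seen order, skip duplicates
                headers.append(key)
    # restval fills in fields that a given product does not have
    writer = csv.DictWriter(csv_file, fieldnames=headers, restval="")
    writer.writeheader()
    writer.writerows(specs)
    return headers


# Made-up dictionaries standing in for two parsed product tables.
specs = [
    {"Brand": "A", "Weight": "1 oz"},
    {"Brand": "B", "Warranty": "90 days"},
]
buffer = io.StringIO()
headers = write_combined(specs, buffer)
```

In the real loop over product links, each page's table would be passed through `table_to_dict` and appended to `specs`, and `buffer` would be replaced by a file opened with `newline=''` as in the code above. The header row can only be written after every product has been parsed, because the full set of columns is not known until then.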