I was trying to scrape a website for some university project. The website is https://www.bonprix.it/prodotto/leggings-a-pinocchietto-pacco-da-2-leggings-a-pinocchietto-pacco-da-2-bianco-nero-956015/?itemOptionId=12211813. I have a problem with my python code. What I want to obtain is all the reviews for the pages from 1 to 5, but instead I get all [].Any help would be appreciated!
Here is the code:
import csv
from bs4 import BeautifulSoup
import urllib.request
import re
import pandas as pd
import requests
reviewlist = []
class AppURLopener(urllib.request.FancyURLopener):
version = "Mozilla/5.0"
opener = AppURLopener()
response = opener.open('https://www.bonprix.it/prodotto/leggings-a-pinocchietto-pacco-da-2-leggings-a-pinocchietto-pacco-da-2-bianco-nero-956015/?itemOptionId=12211813')
soup = BeautifulSoup(response,'html.parser')
reviews = soup.find_all('div',{'class':'reviewContent'})
for i in reviews:
review = {
'per_review_name' : i.find('span',{'itemprop':'name'}).text.strip(),
'per_review' : i.find('p',{'class':'reviewText'}).text.strip(),
'per_review_taglia' : i.find('p',{'class':'singleReviewSizeDescr'}).text.strip(),
}
reviewlist.append(review)
for page in range (1,5):
prova = soup.find_all('div',{'data-page': '{page}'})
print(prova)
print(len(reviewlist))
df = pd.DataFrame(reviewlist)
df.to_csv('list.csv',index=False)
print('Fine.')
And here the output that I get:
[]
5
[]
5
[]
5
[]
5
Fine.