I am working on a project about British Airways reviews, which I am scraping from www.airlinequality.com.
Please take a look at my code. It does not raise any errors, but it does not scrape anything either.
I think the problem is in the `item.find` calls.
Can someone look at the website? I am really struggling to find the right tags and attributes.

import requests
from bs4 import BeautifulSoup

url = 'https://www.airlinequality.com/airline-reviews/british-airways/page/1/?sortby=post_date%3ADesc&pagesize=100'

def get_soup(url):
    # The page is fetched through a rendering service running on localhost:8050
    r = requests.get('http://localhost:8050', params = {'url':url})
    soup = BeautifulSoup(r.text, "lxml")
    return soup

reviewlist=[]

def get_reviews(soup):
    reviews = soup.find_all('div', {'itemprop':'reviewBody'})
    try:
        for item in reviews:
            reviews = {
                'rating': item.find('div', {'itemprop':'reviewRating'}),
                'seat_type': item.find('td', {'class':'review-value'}),
                'body': item.find('div', {'class':'text_content'}).text.strip(),
                'recommended': item.find('td', {'class':'review-rating-header recommended'})
            }
            reviewlist.append(reviews)
    except:
        pass

for x in range(1,100):
    soup = get_soup(f'https://www.airlinequality.com/airline-reviews/british-airways/page/{x}/?sortby=post_date%3ADesc&pagesize=100')
    print(f'Getting page: {x}')
    get_reviews(soup)
    print(len(reviewlist))
    # Stop once the pagination element with class "off" appears
    if not soup.find('li', {'class':'off'}):
        pass
    else:
        break

Getting page: 1
0
Getting page: 2
0
Getting page: 3
0
Getting page: 4
0
Getting page: 5
0
Getting page: 6
0
Getting page: 7
0
Getting page: 8
0
Getting page: 9
0
Getting page: 10
0
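
One possible explanation, sketched against hand-written markup rather than the live page: if the `div` carrying `itemprop="reviewBody"` is itself the `text_content` element, then `item.find('div', {'class':'text_content'})` returns `None`, `.text` raises `AttributeError`, and the bare `except` hides it, so nothing is ever appended. The sample HTML below, the `itemprop="review"` container, the `cabin_flown` class, and the helper names `text_or_none`, `value_for`, and `parse_reviews` are all assumptions made for illustration and would need checking against the real page source. It is also worth confirming that the response text actually contains the review markup before parsing it.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, hand-written for illustration; the live page may differ.
SAMPLE_HTML = """
<article itemprop="review">
  <div itemprop="reviewRating"><span itemprop="ratingValue">7</span>/10</div>
  <div class="text_content" itemprop="reviewBody">Good flight, friendly crew.</div>
  <table class="review-ratings">
    <tr>
      <td class="review-rating-header cabin_flown">Seat Type</td>
      <td class="review-value">Economy Class</td>
    </tr>
    <tr>
      <td class="review-rating-header recommended">Recommended</td>
      <td class="review-value">yes</td>
    </tr>
  </table>
</article>
"""


def text_or_none(tag):
    """Return the stripped text of a tag, or None when the tag was not found."""
    return tag.get_text(strip=True) if tag is not None else None


def value_for(container, header_class):
    """Return the text of the value cell that follows a given header cell."""
    header = container.find('td', class_=header_class)
    if header is None:
        return None
    return text_or_none(header.find_next_sibling('td'))


def parse_reviews(soup):
    """Collect one dict per review container instead of per body div."""
    results = []
    for item in soup.find_all(attrs={'itemprop': 'review'}):
        rating = item.find(attrs={'itemprop': 'ratingValue'})
        results.append({
            'rating': text_or_none(rating),
            'seat_type': value_for(item, 'cabin_flown'),
            'body': text_or_none(item.find('div', class_='text_content')),
            'recommended': value_for(item, 'recommended'),
        })
    return results


soup = BeautifulSoup(SAMPLE_HTML, 'html.parser')

# The selector from the question matches the body div itself,
# so a nested lookup for text_content inside it finds nothing.
body_divs = soup.find_all('div', {'itemprop': 'reviewBody'})
nested = body_divs[0].find('div', {'class': 'text_content'})

parsed = parse_reviews(soup)
print(nested)
print(parsed)
```

Guarding each lookup against `None` instead of wrapping the whole loop in `try`/`except` means a missing field shows up as `None` in the result rather than silently discarding every review on the page.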