Could anyone please help? The extracted reviews end up spread across 3 separate columns in hotelreview.csv. Where am I going wrong, and how can I write them into 1 column with the heading "review", based on the code below? I also want to add the newly extracted data (the "review" column) to the existing csv 'hotel_FortWorth.csv'. At the moment I only write the extracted reviews to a new csv, and I don't know how to combine the 2 files, or whether there is a better way. The url can be repeated to match each review. Thank you!
File 'hotel_FortWorth.csv' has 3 columns, for example:
Name link
1 Omni Fort Worth Hotel https://www.tripadvisor.com.au/Hotel_Review-g55857-d777199-Reviews-Omni_Fort_Worth_Hotel-Fort_Worth_Texas.html
2 Hilton Garden Hotel https://www.tripadvisor.com.au/Hotel_Review-g55857-d2533205-Reviews-Hilton_Garden_Inn_Fort_Worth_Medical_Center-Fort_Worth_Texas.html
3......
...
I used the URLs from the existing csv to extract the reviews. The code is shown below:
import requests
from unidecode import unidecode
from bs4 import BeautifulSoup
import pandas as pd

file = []
data = pd.read_csv('hotel_FortWorth.csv', header=None)
df = data[2]  # third column holds the links

# Skip the header row, then collect the review paragraphs from every page
for url in df[1:]:
    print(url)
    thepage = requests.get(url).text
    soup = BeautifulSoup(thepage, "html.parser")
    resultsoup = soup.find_all("p", {"class": "partial_entry"})
    file.extend(resultsoup)

# Write one review per line to a new file
with open('hotelreview.csv', 'w', newline='') as fid:
    for review in file:
        review_list = review.get_text()
        fid.write(unidecode(review_list + '\n'))
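One likely cause of the extra columns, assuming the file is opened in a spreadsheet or read back as CSV: the review text itself contains commas, and a raw fid.write call does not quote them, so each comma starts a new column. A minimal sketch of one way around this uses csv.writer from the standard library, which quotes such fields, and writes a "review" header row first. The function name write_reviews and the sample strings are made up for illustration and are not real scraped data.

```python
import csv
import io


def write_reviews(reviews, fid):
    """Write each review as a single CSV field under a 'review' header."""
    writer = csv.writer(fid)
    writer.writerow(["review"])      # heading for the single column
    for text in reviews:
        writer.writerow([text])      # fields containing commas are quoted


# Illustrative data only (not real reviews).
sample = ["Great stay, friendly staff, clean room", "Too noisy"]
buffer = io.StringIO()
write_reviews(sample, buffer)

# Reading the result back shows one field per row.
rows = list(csv.reader(io.StringIO(buffer.getvalue())))
```

With a real file, the same function could be called inside `with open('hotelreview.csv', 'w', newline='') as fid:`, passing the strings returned by get_text().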
Expected result:
name link review
1 ... ... ...
2
....
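To get the name / link / review layout above, one option is to remember which URL each review came from while scraping, then write one row per review and repeat the name and link. The sketch below assumes the reviews have already been collected into a dict keyed by URL. The names combine and reviews_by_url, the hotel names, and the example.com links are hypothetical placeholders, not values from hotel_FortWorth.csv.

```python
import csv
import io


def combine(hotel_rows, reviews_by_url):
    """Return rows of [name, link, review], one row per review."""
    combined = [["name", "link", "review"]]
    for name, link in hotel_rows:
        # Repeat name and link for every review found for this link.
        for text in reviews_by_url.get(link, []):
            combined.append([name, link, text])
    return combined


# Placeholder data standing in for the csv rows and the scraped text.
hotels = [
    ("Hotel A", "https://example.com/a"),
    ("Hotel B", "https://example.com/b"),
]
reviews_by_url = {
    "https://example.com/a": ["Lovely lobby, great bar", "Fine"],
    "https://example.com/b": ["Quiet, clean"],
}
combined = combine(hotels, reviews_by_url)

# csv.writer quotes the commas, so each review stays in one column.
out = io.StringIO()
csv.writer(out).writerows(combined)
```

In the scraping loop, the dict could be filled with something like `reviews_by_url[url] = [p.get_text() for p in resultsoup]`. Note that in this sketch a hotel with no reviews produces no output row.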