I am very new to Python and have started working with text data.
I want to add a column to a DataFrame and fill it according to a condition on a different column.
The dataset has 10,000 rows; I shortened it by taking a random sample of 2,000 rows.
I want to add a new column named "Review Sentiment" and fill its cells with 1 if reviews.rating is > 3 and 0 if reviews.rating is <= 3.
Here is what I have tried.
Code:
import pandas as pd

Dataset = pd.read_csv('Datafiniti_Hotel_Reviews.csv')
Dataset_sample = Dataset.sample(n = 2000)
Dataset_sample.head()
i=0
for i in range(len(Dataset_sample.axes[0])):
    # The comparison below is applied to the whole column, not to row i
    if(Dataset_sample['reviews.rating'] < 3):
        Dataset_sample.insert(len(Dataset_sample.axes[1],"Test",1))
    else:
        Dataset_sample.insert(len(Dataset_sample.axes[1],"Test",0))
Error:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
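For reference, the error seems to come from the if line: comparing the whole reviews.rating column to 3 yields a boolean Series (one True/False per row), and if cannot reduce that to a single truth value. A minimal sketch reproducing this, assuming a small hypothetical frame named df built from the ratings in the extract:

```python
import pandas as pd

# Hypothetical stand-in for Dataset_sample, using the ratings from the extract
df = pd.DataFrame({"reviews.rating": [5, 4, 4, 5, 1]})

# Comparing a column to a scalar gives one boolean per row, not a single bool
mask = df["reviews.rating"] < 3
print(mask.tolist())

# Using that Series directly in an if statement raises the ValueError
message = None
try:
    if mask:
        pass
except ValueError as err:
    message = str(err)
    print(message)
```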
Dataset: an extract from the dataset is shown below. Please help using these columns; the logic would remain the same for the full data.
ID province reviews.rating
----------------------------
1 CA 5
7 ST 4
3 DL 4
6 YT 5
5 JD 1
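
One possible way to build the column without a loop, sketched on the extract above: compare the whole rating column to 3 in one step and convert the resulting booleans to integers. The frame name sample is hypothetical; the column name "Review Sentiment" and the threshold come from the description.

```python
import pandas as pd

# Hypothetical frame rebuilt from the extract shown in the question
sample = pd.DataFrame({
    "ID": [1, 7, 3, 6, 5],
    "province": ["CA", "ST", "DL", "YT", "JD"],
    "reviews.rating": [5, 4, 4, 5, 1],
})

# Vectorised comparison: True where rating > 3, then cast True/False to 1/0
sample["Review Sentiment"] = (sample["reviews.rating"] > 3).astype(int)
print(sample)
```

On the real data the same assignment would presumably apply to Dataset_sample. Rows with a missing rating compare as False and would receive 0, which may or may not be the desired behaviour.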