Adding a new column in a Dataframe, with certain conditions

Question

I am very new to Python and have started working on text data.

I want to add a column to the dataframe and fill it according to a condition on a different column.

The dataset had 10000 rows; I shortened it by taking a random sample of 2000 rows.

I want to include a new column named "Review Sentiment" and fill its cells with 1 if reviews.rating is > 3 and 0 if reviews.rating is <= 3.

Here is what I have tried.

Code:

Dataset = pd.read_csv('Datafiniti_Hotel_Reviews.csv')

Dataset_sample = Dataset.sample(n = 2000)
Dataset_sample.head()

i=0

for i in range(len(Dataset_sample.axes[0])):
            if(Dataset_sample['reviews.rating'] < 3):
                Dataset_sample.insert(len(Dataset_sample.axes[1],"Test",1))
            else:
                Dataset_sample.insert(len(Dataset_sample.axes[1],"Test",0))

Error:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Dataset: an extract from the dataset is shown here. Kindly help using these columns; the logic would remain the same for the full dataset.

 ID   province reviews.rating 
 ----------------------------  
 1    CA             5
 7    ST             4
 3    DL             4
 6    YT             5
 5    JD             1

Please post a sample of data which can be copied, not an image. — NYC Coder, May 22 '20 at 21:21
`Dataset_sample['Test'] = Dataset_sample['reviews.rating'].lt(3).astype(int)`. — Quang Hoang, May 22 '20 at 21:21
Also, you may want to do `Dataset_sample = Dataset.sample(n=2000).copy()`. — Quang Hoang, May 22 '20 at 21:22
Please [provide a reproducible copy of the DataFrame with `df.head(10).to_clipboard(sep=',')`](https://stackoverflow.com/questions/52413246/how-to-provide-a-copy-of-your-dataframe-with-to-clipboard). [Stack Overflow Discourages Screenshots](https://meta.stackoverflow.com/questions/303812/discourage-screenshots-of-code-and-or-errors). It is likely the question will be down-voted. You are discouraging assistance because no one wants to retype your data or code, and screenshots are often illegible. — Trenton McKinney, May 22 '20 at 21:36

score 0 · Answer 1 · answered May 23 '20 at 09:27

import pandas as pd

# Data

dfBuses = pd.DataFrame({'size': [40,30], 'cost': [500,400]},
                      index = ['bus1', 'bus2'], columns=['size','cost']) 

print(dfBuses)

# Build the new column row by row: True where cost is at least 450
dfBuses['expensive'] = [row['cost'] >= 450 for _, row in dfBuses.iterrows()]

print(dfBuses)

gives

      size  cost
bus1    40   500
bus2    30   400
      size  cost  expensive
bus1    40   500       True
bus2    30   400      False
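
For the column in the question, the same idea can probably be written without a loop. The sketch below assumes the rule stated in the question text (1 if reviews.rating is > 3, otherwise 0) and uses a hand-typed five-row frame copied from the extract as a stand-in for Dataset_sample; the name `df` is only for illustration. Comparing the whole column at once also sidesteps the ValueError, which is raised when a whole Series is used as the condition of an `if`.

```python
import pandas as pd

# Stand-in for Dataset_sample, typed from the extract in the question
df = pd.DataFrame({
    "ID": [1, 7, 3, 6, 5],
    "province": ["CA", "ST", "DL", "YT", "JD"],
    "reviews.rating": [5, 4, 4, 5, 1],
})

# The comparison is applied to every row and yields a boolean Series;
# astype(int) converts True/False to 1/0.
df["Review Sentiment"] = (df["reviews.rating"] > 3).astype(int)

print(df)
```

If True/False is enough, as in the bus example, the `.astype(int)` step can be dropped.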
