Checking for Specific Value in a Pandas Column and performing further operation

Question

I have a pandas DataFrame with a column named is_retweeted whose values are either Yes or No. If the value is Yes, I want to perform X-type sentiment analysis (I already have the code for it). If the value is No, I want to perform Y-type sentiment analysis (again, I already have the code).

But I am unable to check for this condition. I get the same error seen here. None of the solutions there help with my use case.

Based on what is suggested here, if I do:

s = 'Yes' in tweet_df.is_retweeted
print(s)

I get False as output.

This is what the DataFrame looks like (for ease of representation I haven't displayed the other columns here):

tweet_dt is_retweeted
2020-09-01 No
2020-09-01 No
2020-09-01 Yes

I want to perform something like the operation below, based on the value in the 'is_retweeted' column:

retweets_nerlst = []
while tweet_df['is_retweeted'] == 'Yes':
  for index, row in tqdm(tweet_df.iterrows(), total=tweet_df.shape[0]):
    cleanedTweet = row['tweet'].replace("#", "")
    sentence = Sentence(cleanedTweet, use_tokenizer=True)

PS: My codebase can be seen here

Please include a [mcve] in the text of your question, not as a link — G. Anderson, Sep 01 '20 at 19:54

score 0 · Answer 1 · answered Sep 01 '20 at 20:31

I think you can do it with np.where:

import pandas as pd
import numpy as np


def SentimentX(text):
    #your SentimentX code
    return f"SentimentX_result of {text}"

def SentimentY(text):
    #your SentimentY code
    return f"SentimentY_result of {text}"

data={"date":["2020-09-01","2020-09-02","2020-09-03"], "is_retweeted":["No","No","Yes"],'text':['text1','text2','text3']}

df=pd.DataFrame(data)

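# np.where picks, row by row, the SentimentX result where is_retweeted is "Yes"
# and the SentimentY result otherwise. Note that both functions run on every row.
df['sentiment'] = np.where(
    df["is_retweeted"] == "Yes",
    df['text'].apply(SentimentX),
    df['text'].apply(SentimentY),
)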
print(df)

result:

         date is_retweeted   text                   sentiment
0  2020-09-01           No  text1  SentimentY_result of text1
1  2020-09-02           No  text2  SentimentY_result of text2
2  2020-09-03          Yes  text3  SentimentX_result of text3
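
One caveat with np.where here: both apply calls are evaluated over every row before np.where chooses between them, so each text gets analyzed twice. If the sentiment functions are expensive, a possible alternative is to split the rows with a boolean mask and run each function only on its own subset. This is a minimal sketch, assuming the DataFrame index is unique; sentiment_x and sentiment_y are placeholder names standing in for your own functions:

```python
import pandas as pd


def sentiment_x(text):
    # Placeholder for the X-type analysis (hypothetical stand-in)
    return f"SentimentX_result of {text}"


def sentiment_y(text):
    # Placeholder for the Y-type analysis (hypothetical stand-in)
    return f"SentimentY_result of {text}"


df = pd.DataFrame({
    "date": ["2020-09-01", "2020-09-02", "2020-09-03"],
    "is_retweeted": ["No", "No", "Yes"],
    "text": ["text1", "text2", "text3"],
})

# Boolean mask: True for rows that are retweets
is_rt = df["is_retweeted"] == "Yes"

# Run each function only on the rows it applies to
x_part = df.loc[is_rt, "text"].apply(sentiment_x)
y_part = df.loc[~is_rt, "text"].apply(sentiment_y)

# Column assignment aligns on the index, so each result lands on its original row
df["sentiment"] = pd.concat([x_part, y_part])
print(df)
```

The resulting column should match the np.where version, but each text is passed to only one of the two functions.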

score 0 · Answer 2 · answered Sep 01 '20 at 20:43

If I understand your question correctly, you can use apply with a condition:

tweet_df['result'] = tweet_df.apply(lambda x: sentiment_x(x.text) if x.is_retweeted =='Yes' else sentiment_y(x.text), axis = 1)

This assumes your DataFrame has a "text" column containing the text you want to analyze, and that your sentiment functions return a value, which is stored in a new column I have called "result".
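
As for why the checks in the question did not work: the in operator on a Series tests the index labels, not the values, and a comparison like tweet_df['is_retweeted'] == 'Yes' produces a whole Series of booleans, which while or if cannot reduce to a single True/False. That second case is presumably the error you ran into. A small sketch using sample data shaped like the frame in the question:

```python
import pandas as pd

tweet_df = pd.DataFrame({
    "tweet_dt": ["2020-09-01", "2020-09-01", "2020-09-01"],
    "is_retweeted": ["No", "No", "Yes"],
})

# `in` on a Series looks at the index labels (0, 1, 2), not the values
in_index = "Yes" in tweet_df["is_retweeted"]

# Checking against the underlying values works
in_values = "Yes" in tweet_df["is_retweeted"].values

# Element-wise comparison reduced to one boolean with .any()
has_retweet = (tweet_df["is_retweeted"] == "Yes").any()

print(in_index, in_values, has_retweet)

# Using the element-wise comparison directly as a condition raises ValueError
raised = False
try:
    while tweet_df["is_retweeted"] == "Yes":
        break
except ValueError as exc:
    raised = True
    print(exc)
```

So a whole-column condition needs .any() or .all(), while a per-row decision is better expressed with apply as above or with a boolean mask.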
