
I obtained the adjectives using this function:

from textblob import TextBlob

def getAdjectives(text):
    blob = TextBlob(text)
    # keep only the words tagged "JJ" (adjective)
    return [word for (word, tag) in blob.tags if tag == "JJ"]

dataset['adjectives'] = dataset['text'].apply(getAdjectives)

I obtained the dataframe from a json file using this code:

import json
import pandas as pd

with open('reviews.json') as project_file:
    data = json.load(project_file)
dataset = pd.json_normalize(data)
print(dataset.head())

I ran the sentiment analysis on the dataframe using this code:

dataset[['polarity', 'subjectivity']] = dataset['text'].apply(lambda text: pd.Series(TextBlob(text).sentiment))
print(dataset[['adjectives', 'polarity']])

This is the output:


                                          adjectives  polarity
0                                                 []  0.333333
1  [right, mad, full, full, iPad, iPad, bad, diff...  0.209881
2                             [stop, great, awesome]  0.633333
3                                          [awesome]  0.437143
4                        [max, high, high, Gorgeous]  0.398333
5                                     [decent, easy]  0.466667
6  [it’s, bright, wonderful, amazing, full, few...  0.265146
7                                       [same, same]  0.000000
8         [old, little, Easy, daily, that’s, late]  0.161979
9                       [few, huge, storage.If, few]  0.084762

I tried to split the adjectives into those with positive, neutral and negative polarity using this code:

if dataset['polarity']> 0:
    print(dataset[['adjectives', 'polarity']], "Positive")
        
elif dataset['polarity'] == 0:
    print(dataset[['adjectives', 'polarity']], "Neutral")   
else: 
        print(dataset[['adjectives', 'polarity']], "Negative")

I got the error:

The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Kindly help.

Madea
  • Is [this](https://stackoverflow.com/questions/17071871/how-do-i-select-rows-from-a-dataframe-based-on-column-values) what you are trying to do? – ramzeek Mar 17 '22 at 14:51
  • Yes, but I tried it using the "mask" method and the output is mixed up and incorrect – Madea Mar 17 '22 at 15:52
  • the code ```df=dataset mask = df['polarity'].values > 0 print(dataset[['adjectives', 'polarity']], "Positive") mask = df['polarity'].values == 0 print(dataset[['adjectives', 'polarity']], "Neutral") mask = df['polarity'].values < 0 print(dataset[['adjectives', 'polarity']], "Negative") ``` – Madea Mar 17 '22 at 15:53
  • the output ```few 0.084762 Positive adjectives polarity 0 NaN 0.333333 1 right 0.209881 1 mad 0.209881 1 full 0.209881 1 full 0.209881 1 iPad 0.209881 1 iPad 0.209881 1 bad 0.209881 1 different 0.209881 1 wonderful 0.209881 1 much 0.209881 1 affordable 0.209881 2 stop 0.633333 2 great 0.633333 2 awesome 0.633333 3 awesome 0.437143 4 max 0.398333 4 high 0.398333 4 high 0.398333 ``` the output is too long and incorrect – Madea Mar 17 '22 at 15:55

1 Answer


Try using np.select to determine the sentiment based on the polarity:

import numpy as np

dataset['sentiment'] = np.select(
    [
        dataset['polarity'] > 0,
        dataset['polarity'] == 0
    ],
    [
        "Positive",
        "Neutral"
    ],
    default="Negative"
)

One-liner:

dataset['sentiment'] = np.select([dataset['polarity'] > 0, dataset['polarity'] == 0], ["Positive", "Neutral"], "Negative")
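
For context, the original `if` fails because `dataset['polarity'] > 0` is evaluated row by row and returns a boolean Series, which a plain `if` cannot reduce to a single True/False. Here is a minimal sketch of the whole flow. The sample rows are made up to mirror the question's columns, and `mask`, `raised` and `groups` are illustrative names only:

```python
import numpy as np
import pandas as pd

# Made-up sample data that mirrors the question's columns (an assumption).
dataset = pd.DataFrame({
    "adjectives": [["stop", "great", "awesome"], ["same", "same"], ["bad"]],
    "polarity": [0.633333, 0.0, -0.7],
})

# The comparison is element-wise: one boolean per row, not a single True/False.
mask = dataset["polarity"] > 0

# Using that Series in a plain `if` raises the ValueError from the question.
try:
    if mask:
        pass
    raised = False
except ValueError:
    raised = True

# np.select labels each row with the choice of the first condition that matches.
dataset["sentiment"] = np.select(
    [dataset["polarity"] > 0, dataset["polarity"] == 0],
    ["Positive", "Neutral"],
    default="Negative",
)

# Index the dataframe with a boolean mask to print each group separately.
groups = {}
for label in ["Positive", "Neutral", "Negative"]:
    groups[label] = dataset[dataset["sentiment"] == label]
    print(label)
    print(groups[label][["adjectives", "polarity"]])
```

Note that the mask has to be used as an index (`dataset[mask]`). Computing a mask and then printing the full dataframe, as in the comment above, prints every row regardless of polarity.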