I use VADER for sentiment analysis. When I add a single word to the VADER lexicon, it works, i.e. the newly added word is detected as positive or negative based on the value I assign to it. The code is below:

import nltk
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
sid_obj = SentimentIntensityAnalyzer() 
new_word = {'counterfeit':-2,'Good':2,}
sid_obj.lexicon.update(new_word)
sentence = "Company Caught Counterfeit." 
sentiment_dict = sid_obj.polarity_scores(sentence) 
tokenized_sentence = nltk.word_tokenize(sentence)
pos_word_list=[]
neu_word_list=[]
neg_word_list=[]

for word in tokenized_sentence:
    if (sid_obj.polarity_scores(word)['compound']) >= 0.1:
        pos_word_list.append(word)
    elif (sid_obj.polarity_scores(word)['compound']) <= -0.1:
        neg_word_list.append(word)
    else:
        neu_word_list.append(word)                

print('Positive:',pos_word_list)
print('Neutral:',neu_word_list)
print('Negative:',neg_word_list) 

print("Overall sentiment dictionary is : ", sentiment_dict) 
print("sentence was rated as ", sentiment_dict['neg']*100, "% Negative") 
print("sentence was rated as ", sentiment_dict['neu']*100, "% Neutral") 
print("sentence was rated as ", sentiment_dict['pos']*100, "% Positive") 

print("Sentence Overall Rated As", end = " ") 

# decide sentiment as positive, negative and neutral 
if sentiment_dict['compound'] >= 0.05 : 
    print("Positive") 

elif sentiment_dict['compound'] <= - 0.05 : 
    print("Negative") 

else : 
    print("Neutral") 

The output is as follows:

Positive: []
Neutral: ['Company', 'Caught', '.']
Negative: ['Counterfeit']
Overall sentiment dictionary is :  {'neg': 0.6, 'neu': 0.4, 'pos': 0.0, 'compound': -0.4588}
sentence was rated as  60.0 % Negative
sentence was rated as  40.0 % Neutral
sentence was rated as  0.0 % Positive
Sentence Overall Rated As Negative

This works perfectly when the words are added directly in code. However, when I try to add multiple words from a CSV file using the code below, the word Counterfeit does not seem to be added to the VADER lexicon:

new_word={}
import csv
with open('Dictionary.csv', newline='') as csvfile:
    reader = csv.DictReader(csvfile)
    for row in reader:
        new_word[row['Word']] = int(row['Value'])
print(new_word)
sid_obj.lexicon.update(new_word)

The code above prints the dictionary that is then added to the lexicon. It looks like this (it has about 2000 words, but I've only shown a few; Counterfeit is also one of them):

{'CYBERATTACK': -2, 'CYBERATTACKS': -2, 'CYBERBULLYING': -2, 'CYBERCRIME': 
-2, 'CYBERCRIMES': -2, 'CYBERCRIMINAL': -2, 'CYBERCRIMINALS': -2, 
'MISCHARACTERIZATION': -2, 'MISCLASSIFICATIONS': -2, 'MISCLASSIFY': -2, 
'MISCOMMUNICATION': -2, 'MISPRICE': -2, 'MISPRICING': -2, 'STRICTLY': -2}

The output is as follows:

Positive: []
Neutral: ['Company', 'Caught', 'Counterfeit', '.']
Negative: []
Overall sentiment dictionary is :  {'neg': 0.0, 'neu': 1.0, 'pos': 0.0, 'compound': 0.0}
sentence was rated as  0.0 % Negative
sentence was rated as  100.0 % Neutral
sentence was rated as  0.0 % Positive
Sentence Overall Rated As Neutral

Where am I going wrong when adding multiple words to the lexicon? The CSV file has two columns: one with the word and the other with its value as a negative or positive number. Why is the word still identified as neutral? Any help would be appreciated. Thank you.

Rathan M
  • Even though you've mentioned that Counterfeit is a word in the dictionary, it would probably be less confusing if you included it in the sample `new_word` print output. Also, can you check if its value in the csv is negative and not neutral? – shriakhilc Jun 04 '19 at 05:52
  • @TheGamer007 I've checked it, and the word counterfeit is in the CSV file with -2 as the value. Just in case, I reduced the CSV to just 10 entries and tried with Cyberattack, which was the first entry with -2, and still got the same result, i.e. Neutral – Rathan M Jun 04 '19 at 07:28
  • @TheGamer007 Solved it, thanks. Issue was that I put up my text in dictionary in Upper case. It's always supposed to be stored in lower case. – Rathan M Jun 04 '19 at 08:17

1 Answer

Solved it, thanks. The issue was that the words in my dictionary were stored in upper case. VADER converts each word to lower case before looking it up in the lexicon, so the lexicon keys must always be stored in lower case.
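For anyone hitting the same problem, here is a minimal sketch of how the CSV loading could lowercase the keys before updating the lexicon. It assumes the same `Word` and `Value` column names as in the question; the helper name `load_lexicon_entries` and the inline sample data are only for illustration, and values are read as floats so that fractional scores would also work.

```python
import csv
import io

def load_lexicon_entries(csvfile):
    """Read Word/Value rows and return a dict keyed by the lowercased word."""
    entries = {}
    for row in csv.DictReader(csvfile):
        # Lowercase the key, because VADER lowercases words before lookup.
        word = row['Word'].strip().lower()
        if word:
            entries[word] = float(row['Value'])
    return entries

# Small in-memory sample standing in for Dictionary.csv (illustrative data).
sample = io.StringIO("Word,Value\nCOUNTERFEIT,-2\nCYBERATTACK,-2\n")
new_word = load_lexicon_entries(sample)
print(new_word)  # {'counterfeit': -2.0, 'cyberattack': -2.0}

# With the real file and the analyzer from the question, this would be:
# with open('Dictionary.csv', newline='') as csvfile:
#     sid_obj.lexicon.update(load_lexicon_entries(csvfile))
```

After updating the lexicon this way, the sentence from the question should pick up the negative value for "Counterfeit", since the lowercased word now matches a lexicon key.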

Rathan M