54

I was wondering if anybody knew where I could obtain dictionaries of positive and negative words. I'm looking into sentiment analysis and this is a crucial part of it.

Raj
user387049

9 Answers

38

The Sentiment Lexicon at the University of Pittsburgh might be what you are after. It's a lexicon of about 8,000 words with positive/neutral/negative sentiment. It's described in more detail in this paper and released under the GPL.

Tim
Stompchicken
28

Sentiment Analysis (Opinion Mining) lexicons


Sources:

Kurt Bourbaki
  • 1
    moved: [SentiWordNet](https://ontotext.fbk.eu/sentiwn.html) ([SentiWordNet on github](https://github.com/aesuli/SentiWordNet)). Harvard General Inquirer is gone ([DictionaryGI.rda in SentimentAnalysis.R](https://github.com/sfeuerriegel/SentimentAnalysis/tree/master/data)). vader moved to [vader_lexicon.txt](https://github.com/cjhutto/vaderSentiment/blob/master/vaderSentiment/vader_lexicon.txt) – milahu Mar 10 '23 at 08:35
  • @milahu Thank you. Do you know why the Harvard General Inquirer website has been taken down? I couldn't find any info – Kurt Bourbaki Mar 12 '23 at 19:46
26

Arriving a bit late, I'll just note that dictionaries make only a limited contribution to sentiment analysis. Some sentiment-bearing sentences do not contain any "sentiment" word, e.g. "read the book", which could be positive in a book review but negative in a movie review. Similarly, the sentiment word "unpredictable" could be positive in the context of a thriller but negative when describing the brake system of a Toyota.

and there are many more...
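To make the limitation concrete, here is a minimal sketch of a context-free dictionary scorer. The word lists are tiny made-up placeholders rather than entries from a real lexicon, and `lexicon_score` is a hypothetical helper name.

```python
import re

# Tiny placeholder word lists for illustration only (not from a real lexicon).
POSITIVE = {"good", "great", "excellent"}
NEGATIVE = {"bad", "poor", "unpredictable"}


def lexicon_score(text):
    """Return (number of positive words) minus (number of negative words)."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return sum((t in POSITIVE) - (t in NEGATIVE) for t in tokens)


# No lexicon word at all: the score is 0 whatever the reviewer meant.
print(lexicon_score("Read the book."))
# The same word gets the same score regardless of the domain.
print(lexicon_score("The plot of this thriller is unpredictable."))
print(lexicon_score("The brakes are unpredictable."))
```

Both "unpredictable" sentences receive the same score, which is exactly the context problem described above.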

ScienceFriction
  • 1
    Really good points. Luckily for me, I'm dealing only with certain news sources that refrain from using slang and generally just state facts. Still definitely something to worry about though, thanks. – user387049 Feb 17 '11 at 23:35
  • 2
    I think when using dictionaries without context, the hope is that while there may be a certain amount of noise (misclassification) for individual sentences, there will be enough signal in the aggregate to be meaningful. I'm not sure how one would go about testing this hope with statistical rigor, though. – mcduffee Aug 08 '14 at 16:24
12

Professor Bing Liu provides an English lexicon of about 6,800 words, which you can download from this link: Opinion Mining, Sentiment Analysis, and Opinion Spam Detection
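Once downloaded, a word list like this can be read into a set. The sketch below assumes a plain-text file with one word per line, header lines starting with `;`, and Latin-1 encoding; check these assumptions against the file you actually get. `load_word_list` is a hypothetical helper name.

```python
from pathlib import Path


def load_word_list(path, encoding="latin-1"):
    """Read one word per line, skipping blank lines and ';' comment lines."""
    words = set()
    for line in Path(path).read_text(encoding=encoding).splitlines():
        line = line.strip()
        if line and not line.startswith(";"):
            words.add(line.lower())
    return words
```

Usage would look something like `positive = load_word_list("positive-words.txt")`, where the file name is an assumption about how the download is laid out.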

Community
rodobastias
7

This paper from 2002 describes an algorithm for deriving such a dictionary from text samples automatically, using only two words as a seed set.
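As the comment below notes, the approach scores words with PMI-IR computed from search-engine hit counts. A rough sketch of the idea, with the hit counts swapped for document co-occurrence counts in a local toy corpus, might look like this. The seed words `excellent` and `poor`, the smoothing constant `eps`, and the function name are assumptions made for illustration.

```python
import math


def semantic_orientation(word, docs, pos_seed="excellent", neg_seed="poor", eps=0.01):
    """SO(word) = PMI(word, pos_seed) - PMI(word, neg_seed).

    Counts are the number of documents containing all given words;
    eps is added to every count to avoid division by zero (an assumption).
    """
    token_sets = [set(d.lower().split()) for d in docs]

    def hits(*words):
        return sum(all(w in s for w in words) for s in token_sets) + eps

    return math.log2(
        (hits(word, pos_seed) * hits(neg_seed))
        / (hits(word, neg_seed) * hits(pos_seed))
    )


# Toy corpus, invented for this example.
docs = [
    "excellent food and friendly staff",
    "friendly service excellent value",
    "poor service and rude staff",
    "rude waiter poor food",
]
for w in ("friendly", "rude", "food"):
    print(w, round(semantic_orientation(w, docs), 2))
```

Words that co-occur mostly with the positive seed get a positive score, words that co-occur mostly with the negative seed get a negative one, and words seen equally with both end up near zero.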

Fred Foo
  • 3
    The problem is that this approach uses AltaVista hits to compute PMI-IR, so I do not think it is optimal for someone who wants to get started. Moreover it is an unsupervised approach, and its results are still not exciting if compared to supervised approaches. – Kurt Bourbaki Jul 14 '15 at 07:19
  • I cannot access the link. Could you please mention the title of the page? – zacknight95 Oct 25 '21 at 17:00
4

You can find AFINN here, and you can also build such a list dynamically: whenever an unknown positive word turns up, add it with a score of +1. For example, if "banana" is a new positive word and appears twice, its score becomes +2.

The more articles and data you crawl, the stronger your dictionary becomes!
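A minimal sketch of that idea follows. The answer does not say how a new word is known to be positive, so this assumes each crawled text already carries a `"pos"` or `"neg"` label and that an unknown word inherits the label of the text it appears in. `grow_lexicon` is a hypothetical name, and the seed entry is a placeholder.

```python
from collections import Counter


def grow_lexicon(lexicon, labelled_texts):
    """Add +1 / -1 to an unknown word's score per occurrence in a pos / neg text."""
    scores = Counter(lexicon)
    for text, label in labelled_texts:
        step = 1 if label == "pos" else -1
        for token in text.lower().split():
            if token not in lexicon:  # leave the original entries untouched
                scores[token] += step
    return dict(scores)


seed = {"good": 3}  # placeholder seed entry
texts = [("good banana", "pos"), ("banana smoothie", "pos")]
print(grow_lexicon(seed, texts))
```

Here "banana" appears twice in positive texts and ends up with +2. Note that function words such as "the" would accumulate scores too, so some stop-word filtering would probably be needed in practice.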

user123
  • 4
    That file is really a toy file, created for a class assignment. In my opinion, it would be a mistake to use it for real work. – mcduffee Aug 08 '14 at 16:28
  • @mcduffee Elaborate? – jzonthemtn Jan 18 '16 at 14:50
  • @jbird I'm not sure what I can add. The file was created for a class assignment, where the text to evaluate was tailored to the words in the list. It is missing many, many words (the entire list is less than 2500 words). Attempting to use it with text which has not been tailored to the words in the list would, I fear, result in less accurate assessments of sentiment than a more complete list would provide. – mcduffee Jan 20 '16 at 00:57
3

The Harvard-IV dictionary directory http://www.wjh.harvard.edu/~inquirer/homecat.htm has at least two sets of ready-to-use dictionaries for positive/negative orientation.

Kiara
3

You can use the VADER sentiment lexicon, which ships with NLTK:

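import nltk
from nltk.sentiment.vader import SentimentIntensityAnalyzer

nltk.download('vader_lexicon')  # one-time download of the VADER lexicon

sentence = 'APPle is good for health'
sid = SentimentIntensityAnalyzer()
# polarity_scores returns a dict with 'neg', 'neu', 'pos' and 'compound' keys
ss = sid.polarity_scores(sentence)
print(ss)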

This gives you the polarity scores of the sentence.

Output:

 {'compound': 0.4404, 'neu': 0.58, 'pos': 0.42, 'neg': 0.0}
Techgeeks1
3

Sentiwords gives 155,000 words (and their polarity, that is, a score between -1 and 1 for very negative through to very positive). The lexicon is discussed here
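With graded scores like these, one simple option is to average the scores of the words that are found. The sketch below uses a made-up three-word dictionary rather than real SentiWords entries, and `mean_polarity` is a hypothetical helper name.

```python
def mean_polarity(text, scores):
    """Average the scores of the tokens found in the lexicon; 0.0 if none are found."""
    found = [scores[t] for t in text.lower().split() if t in scores]
    return sum(found) / len(found) if found else 0.0


# Invented scores in the range [-1, 1], for illustration only.
toy_scores = {"wonderful": 0.8, "dull": -0.5, "okay": 0.1}
print(mean_polarity("a wonderful but dull film", toy_scores))
```

Averaging keeps the result in the same -1 to 1 range as the lexicon itself, whereas summing would make longer texts look more extreme.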

stevec