28

Are there any R packages that focus on sentiment analysis? I have a small survey where users can write a comment about their experience of using a web-tool. I ask for a numerical ranking, and there is the option of including a comment.

I am wondering what the best way of assessing the positiveness or negativeness of the comment is. I would like to be able to compare it to the numerical ranking that the user provides, using R.

Mansfield
djq
  • Check out Jeffery Breen's work here: http://www.slideshare.net/jeffreybreen/r-by-example-mining-twitter-for – mweylandt Apr 19 '12 at 17:09
  • @mweylandt, as a fellow Jeffrey myself, it's "r-e-y." But it seems like a simple, neat method. – Jeff Allen Apr 19 '12 at 19:24
  • Jeffrey Breen provides an excellent guide, above all for text-mining beginners like me. I recommend visiting the link shared by Paras. From that link you can go to Professor Bing Liu's website, which specializes in the subject: Opinion Mining, Sentiment Analysis, and Opinion Spam Detection (http://www.cs.uic.edu/~liub/FBS/sentiment-analysis.html). Regards, Rod – rodobastias Feb 18 '13 at 12:30

5 Answers

26

And there is this package:

sentiment: Tools for Sentiment Analysis

sentiment is an R package with tools for sentiment analysis, including Bayesian classifiers for positivity/negativity and emotion classification.

Update 14 Dec 2012: it has been removed from CRAN and moved to the archive...

Update 15 Mar 2013: the qdap package has a polarity function, based on Jeffrey Breen's work
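
For readers who want to see the idea behind a lexicon-based approach without installing anything, here is a minimal base-R sketch: each comment is scored as the number of positive words minus the number of negative words, and the score is then compared with the numerical rating using a rank correlation. The word lists, the `score_sentiment` helper, and the example comments and ratings are all made up for illustration; none of this is taken from qdap or from Breen's slides, and a real analysis would use a full opinion lexicon.

```r
# Hypothetical, tiny word lists; a real analysis would use a full opinion lexicon.
pos_words <- c("good", "great", "love", "easy", "helpful")
neg_words <- c("bad", "slow", "hate", "confusing", "broken")

# Score each comment as (# positive words) - (# negative words).
# score_sentiment is an illustrative helper, not a function from any package.
score_sentiment <- function(comments, pos, neg) {
  vapply(comments, function(comment) {
    # Lowercase, then turn anything that is not a letter or whitespace into a space
    cleaned <- gsub("[^a-z[:space:]]", " ", tolower(comment))
    # Split on runs of whitespace to get individual words
    words <- strsplit(cleaned, "[[:space:]]+")[[1]]
    sum(words %in% pos) - sum(words %in% neg)
  }, numeric(1), USE.NAMES = FALSE)
}

# Made-up survey responses: a 1-5 rating plus a free-text comment.
survey <- data.frame(
  rating  = c(5, 1, 4, 2),
  comment = c("Great tool, easy to use and very helpful",
              "Slow and confusing, I hate it",
              "Good overall",
              "The export is broken"),
  stringsAsFactors = FALSE
)

survey$score <- score_sentiment(survey$comment, pos_words, neg_words)
survey[, c("rating", "score")]

# Rank correlation between the text-derived score and the user's rating.
cor(survey$score, survey$rating, method = "spearman")
```

Spearman correlation is used here because both the ratings and the word-count scores are ordinal rather than truly continuous; with only a handful of responses the estimate will be noisy, so treat it as a rough check rather than a firm result.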

Ben
18

Here's the work I've done on sentiment analysis in R.

The code is by no means polished or well packaged, but I posted it on GitHub with basic documentation. I used the ViralHeat sentiment API, which just returns JSON, so the actual function to do the sentiment analysis is pretty trivial (see code here).

Feel free to contact me if you're having trouble using it. Note that you'll need to register for an API key with ViralHeat before you can use it. If you find the quotas too restrictive: I contacted them, and they were happy to give me a ton more queries for a few months while I played around with the API.

Jeff Allen
5

For a step-by-step guide to using 1) the ViralHeat API, 2) Jeffrey Breen's approach, and 3) the sentiment package, check out this link: https://sites.google.com/site/miningtwitter/questions/sentiment

paras_doshi
2

I've tried to reorganize and provide a cohesive sentiment analysis package here. SentR includes word stemming and preprocessing and provides access to the ViralHeat API, a default aggregating function as well as a more advanced Naive Bayes method.

Installing is relatively simple:

# devtools provides install_github() for installing packages from GitHub
install.packages('devtools')
library(devtools)
install_github('mananshah99/sentR')
library(sentR)

And a simple classification example:

# Create small vectors of positive and negative words (used by classify.aggregate below)
positive <- c('happy', 'well-off', 'good', 'happiness')
negative <- c('sad', 'bad', 'miserable', 'terrible')

# Sentences to test sentiment
test <- c('I am a very happy person.', 'I am a very sad person', 
'I’ve always understood happiness to be appreciation. There is no greater happiness than appreciation for what one has- both physically and in the way of relationships and ideologies. The unhappy seek that which they do not have and can not fully appreciate the things around them. I don’t expect much from life. I don’t need a high paying job, a big house or fancy cars. I simply wish to be able to live my life appreciating everything around me. 
')

# 1. Simple Summation
out <- classify.aggregate(test, positive, negative)
out

# 2. Naive Bayes
out <- classify.naivebayes(test)
out

Which provides the following output:

  score
1     1
2    -1
3     2

     POS                NEG                 POS/NEG             SENT      
[1,] "9.47547003995745" "0.445453222112551" "21.2715265477714"  "positive"
[2,] "1.03127774142571" "9.47547003995745"  "0.108836578774127" "negative"
[3,] "67.1985217685598" "35.1792261323723"  "1.9101762362738"   "positive"
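
To help read the Naive Bayes output: the POS/NEG column appears to be simply the first column divided by the second. The sketch below reproduces that arithmetic from the values printed above (rounded to four decimals). The decision rule used here, labelling a text positive when the ratio exceeds 1, is an assumption that happens to match the three rows shown; it has not been checked against the sentR source.

```r
# POS and NEG values copied from the output above, rounded to 4 decimal places.
pos <- c(9.4755, 1.0313, 67.1985)
neg <- c(0.4455, 9.4755, 35.1792)

# The third column of the output looks like the ratio of the first two.
ratio <- pos / neg

# Assumed decision rule (not confirmed against the sentR source):
# ratio above 1 means "positive", otherwise "negative".
label <- ifelse(ratio > 1, "positive", "negative")

data.frame(POS = pos, NEG = neg, ratio = round(ratio, 2), label = label)
```

Read this way, the third sentence is only weakly positive (a ratio of about 1.9) compared with the first (about 21), which is information the simple summation scores do not convey.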

Please feel free to contribute :) Hope that helps!

manan
  • Hi Manan, I like your solution. I tried it and will experiment more. Do you have a use case, such as a public project, that other people can use? Thx – seakyourpeak Dec 30 '15 at 02:09
  • @seakyourpeak thanks for the comment! I'm working on a sample Twitter sentiment extraction repository (github.com/manans99), but for the time being the documentation for each function includes a sample use case. If you have any further questions feel free to PM me. – manan Dec 31 '15 at 19:10
  • @manan I am currently working on Facebook post data. I have been able to extract the posts and construct a word cloud. Do you think it is a good idea to use the most common words for my negative and positive lists? E.g., if my word cloud contained like, bad, great, love, happy, sad, plane, car, transport, ..., I would use like, great, love, happy as positive classifiers and sad, bad as negative ones? – Nico Coallier Mar 16 '17 at 20:19
0

You can still use the sentiment package. Install it following the script below.

You may need R 3.x.

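library(devtools)
# Install the archived release directly from the CRAN archive
install_url("http://cran.r-project.org/src/contrib/Archive/sentiment/sentiment_0.2.tar.gz")
library(sentiment)
# List the objects the package provides
ls("package:sentiment")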
Bill the Lizard
Frank Wang
  • The sentiment package depends on the Rstem package, which is also not supported by R 3.0.2 – Nishanth Lawrence Reginold Apr 09 '14 at 02:50
  • Yes, even the source site: https://sites.google.com/site/miningtwitter/home warns: Due to changes in twitter APIs, the code in this google site is no longer supported... although you are more than welcome to browse its content – Matt Jul 26 '14 at 20:40