I am working on my first sentiment analysis project, which will use tweets about sports as input. I am currently preprocessing the data and trying to assign a polarity to each tweet. The many different ways of assigning these sentiment scores are confusing me a bit, so I have some questions:
This thread (Training data for sentiment analysis) lists some corpora, but none of them covers sports. Can I use one of them to train a classifier for my case, or would training on an out-of-domain corpus skew the results?
Is it possible to achieve good results on this topic by relying on a lexicon (e.g. one from the thread linked above)?
Should I query my database and manually annotate tweets in order to train a classifier?
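To make the lexicon option concrete, this is roughly the kind of scoring I have in mind. It is only a minimal sketch: the word list and weights are made-up placeholders rather than a real sentiment lexicon, the negation handling is a naive assumption, and `polarity` and `label` are names I chose for illustration.

```python
import re

# Hypothetical toy lexicon: placeholder words and weights,
# not taken from any published sentiment resource.
TOY_LEXICON = {
    "win": 1.0,
    "great": 1.0,
    "amazing": 1.5,
    "lose": -1.0,
    "terrible": -1.5,
    "injury": -1.0,
}

# Naive assumption: a negation word flips the sign of the next lexicon word.
NEGATIONS = {"not", "no", "never"}


def polarity(tweet, lexicon=TOY_LEXICON):
    """Sum the lexicon weights of the words in a tweet."""
    tokens = re.findall(r"[a-z']+", tweet.lower())
    score = 0.0
    negate = False
    for token in tokens:
        if token in NEGATIONS:
            negate = True
            continue
        weight = lexicon.get(token, 0.0)
        if weight != 0.0:
            score += -weight if negate else weight
            negate = False  # the negation is consumed by this lexicon word
    return score


def label(tweet, lexicon=TOY_LEXICON):
    """Map the summed score to a coarse polarity label."""
    score = polarity(tweet, lexicon)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for text in ["What a great win tonight!", "Terrible injury news", "Kickoff is at 8pm"]:
        print(text, "->", label(text))
```

My worry with this approach is that a general-purpose lexicon may miss or misjudge sports-specific vocabulary, which is part of why I am asking whether manual annotation is the better route.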
Thanks