I want to count the positive and negative emoticons in tweets using Python. I wrote the following regular expressions to extract positive and negative emoticons, respectively:
((:|;|8)+(-)*(\)|D|P|p)+)|((\()+(-)*(:|;)+)
((:)+(-|')*(\()+)|((\))+(-)*(:|;)+)
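A minimal sketch of how these two patterns might be applied to count matches with Python's `re` module. The function name `count_emoticons` and the sample tweet are illustrative assumptions, not necessarily the exact code used:

```python
import re

# The two patterns from above, unchanged.
POSITIVE = re.compile(r"((:|;|8)+(-)*(\)|D|P|p)+)|((\()+(-)*(:|;)+)")
NEGATIVE = re.compile(r"((:)+(-|')*(\()+)|((\))+(-)*(:|;)+)")

def count_emoticons(text):
    """Return (positive, negative) counts of non-overlapping matches."""
    # finditer is used rather than findall: the patterns contain capture
    # groups, so findall would return tuples of groups instead of matches.
    positive = sum(1 for _ in POSITIVE.finditer(text))
    negative = sum(1 for _ in NEGATIVE.finditer(text))
    return positive, negative

# Hypothetical sample tweet, for illustration only.
sample = "Great day :-) but the train was late :( ;D"
print(count_emoticons(sample))  # (2, 1)
```

Each pattern is scanned independently, so a string such as `(:(` could in principle be counted by both patterns.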
But I am getting very low recall. I think one reason may be that emoticons are often encoded as Unicode characters rather than typed as ASCII sequences. I have seen the following questions:
removing characters of a specific unicode range from a string
PHP : writing a simple removeEmoji function
However, the regular expressions suggested in the answers to these questions give me many false positives, and they were not designed to handle positive and negative emoticons separately.
I have also seen this GitHub list of Unicode smileys and this Wikipedia list of emoticons, but I am not sure how to use them for my purpose.
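One way such a list might be used is as a lookup table, sketched below. The two sets are small, hand-picked assumptions for illustration and are not taken from either linked list. The function name `count_emoji` is hypothetical, and the input is assumed to be an already-decoded Python 3 `str`:

```python
# Hypothetical, hand-labelled examples; a real table would be built from
# a complete list with a polarity assigned to each entry.
POSITIVE_EMOJI = {"\U0001F600", "\U0001F60A", "\U0001F60D"}  # 😀 😊 😍
NEGATIVE_EMOJI = {"\U0001F61E", "\U0001F620", "\U0001F622"}  # 😞 😠 😢

def count_emoji(text):
    """Return (positive, negative) counts of single-code-point emoji."""
    # Iterating over a Python 3 str yields whole code points, so each
    # emoji above is seen as one character.
    positive = sum(1 for ch in text if ch in POSITIVE_EMOJI)
    negative = sum(1 for ch in text if ch in NEGATIVE_EMOJI)
    return positive, negative

# Hypothetical sample tweet, for illustration only.
print(count_emoji("Loved it \U0001F60D\U0001F600 but the ending \U0001F622"))  # (2, 1)
```

This only covers emoji that are a single code point. Its counts could be added to those from the ASCII patterns to get one total per polarity.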