
I am trying to label my training data, a set of documents, with two categories: positive and negative. My plan is to tokenize each document into words, then compare the tokens of each record against a sentiment dictionary and count how many positive and how many negative words that record contains. The two counts are then compared, and the record is labeled with whichever sentiment has the higher count. I need a rough idea of how to do this in Python.
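The procedure described above can be sketched roughly as follows. This is only a minimal sketch under stated assumptions: `POSITIVE_WORDS` and `NEGATIVE_WORDS` are hypothetical toy lexicons standing in for a real sentiment dictionary, the tokenizer is a simple regular expression rather than a dedicated library, and the `tie_label` fallback is an assumption, since the description does not say what should happen when the counts are equal.

```python
import re

# Hypothetical toy lexicons; replace with the words from a real sentiment dictionary.
POSITIVE_WORDS = {"good", "great", "love", "happy", "excellent"}
NEGATIVE_WORDS = {"bad", "poor", "hate", "sad", "terrible"}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def label_document(text, tie_label="neutral"):
    """Label one record by comparing its positive and negative word counts."""
    tokens = tokenize(text)
    pos = sum(1 for t in tokens if t in POSITIVE_WORDS)
    neg = sum(1 for t in tokens if t in NEGATIVE_WORDS)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    # Assumption: equal counts (including zero matches) get a fallback label.
    return tie_label


documents = [
    "I love this phone, the camera is great",
    "Terrible service and bad food",
    "The package arrived on Monday",
]
labels = [label_document(d) for d in documents]
print(labels)
```

If only two labels are wanted, passing `tie_label="positive"` or `tie_label="negative"` forces ties into one of the two classes; which choice is appropriate depends on the data.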

Arkan
  • Have you chosen a sentiment lexicon? That would be a good place to start. There are many good ones available, but you need to bear in mind which languages it covers compared to your data and which dialect it was meant for; using a US lexicon for UK or Australian data can cause issues. – Patrick Nov 05 '21 at 08:29
  • Yes, I've checked that dictionary, but I don't know how to apply it to my problem. – Arkan Nov 05 '21 at 09:42
  • If you have a lexicon then you have a labeled dataset. What you're asking would be better served by working through a sentiment analysis tutorial and then asking more specific questions if you run into problems. – Patrick Nov 05 '21 at 09:46
  • Yes, I've been doing that and I ran into another problem; please see this link: https://stackoverflow.com/questions/69851442/q-python-spell-checker-using-nltk – Arkan Nov 05 '21 at 10:00

0 Answers