There are many similar questions, and I have tried every solution I could find, but I can't get this to work. I am working on Named Entity Recognition using the Stanford NER tagger through NLTK. This is my code:
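from nltk.tag import StanfordNERTagger

# Raw strings keep the Windows backslashes literal and avoid invalid-escape warnings.
st = StanfordNERTagger(r'stanford-ner\classifiers\english.all.3class.distsim.crf.ser.gz',
                       r'stanford-ner\stanford-ner.jar', encoding='utf-8')

# Tag a whitespace-tokenized sentence that contains the euro sign.
tuple_list = st.tag("Please pay €94 million.".split())
print(tuple_list)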
This is the error I get (the traceback is from a run with a longer sentence that also contains the € sign):
Traceback (most recent call last):
File "C:/Users/Dell/PycharmProjects/CSSOP/ner2.py", line 4, in <module>
tuple_list = st.tag("He was the subject of the most expensive association football transfer when he moved from Manchester United to Real Madrid in 2009 in a transfer worth €94 million ($132 million).".split())
File "C:\ProgramData\Anaconda3\lib\site-packages\nltk\tag\stanford.py", line 71, in tag
return sum(self.tag_sents([tokens]), [])
File "C:\ProgramData\Anaconda3\lib\site-packages\nltk\tag\stanford.py", line 95, in tag_sents
stanpos_output = stanpos_output.decode(encoding)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 247: invalid start byte
Edit: This is not a file-opening encoding issue, as suggested in other similar questions.
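One observation that may help narrow this down: byte 0x80 is how the € sign is encoded in Windows-1252 (cp1252), so my guess is that the tagger's output reaches NLTK in the Windows default encoding rather than UTF-8. That is only an assumption on my part. The sketch below does not involve NLTK or the Stanford tagger at all; it just shows the byte-level mismatch and reproduces the same kind of decode error:

```python
# Diagnostic sketch: how the euro sign is encoded under two codecs.
euro = "€"

utf8_bytes = euro.encode("utf-8")      # three bytes in UTF-8
cp1252_bytes = euro.encode("cp1252")   # a single byte in Windows-1252

print(utf8_bytes)    # b'\xe2\x82\xac'
print(cp1252_bytes)  # b'\x80'

# Decoding the cp1252 byte as UTF-8 fails in the same way as the traceback above.
try:
    cp1252_bytes.decode("utf-8")
    decode_error = None
except UnicodeDecodeError as exc:
    decode_error = str(exc)
    print(decode_error)
```

If that guess is right, the problem would be on the side that produces the bytes, not in how my script reads any file.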