I am trying to figure out how to use the caseless version of the entity recognizer from NLTK. I downloaded http://nlp.stanford.edu/software/stanford-ner-2015-04-20.zip and placed it in the site-packages folder of python. Then I downloaded http://nlp.stanford.edu/software/stanford-corenlp-caseless-2015-04-20-models.jar and placed it in the folder. Then I ran this code in NLTK
from nltk.tag.stanford import NERTagger
english_nertagger = NERTagger(‘/home/anaconda/lib/python2.7/site-packages/stanford-ner-2015-04-20/classifiers/english.conll.4class.distsim.crf.ser.gz’, ‘/home/anaconda/lib/python2.7/site-packages/stanford-ner-2015-04-20/stanford-corenlp-caseless-2015-04-20-models.jar’)
But when I run this:
english_nertagger.tag(‘Rami Eid is studying at stony brook university in NY’.split())
I get an error:
Error: Could not find or load main class edu.stanford.nlp.ie.crf.CRFClassifier
Any help if you have experience is appreciated!
P.S. I can get the non-caseless version working fine but I find that when analysing search queries, users hardly ever capitalize words and the non-caseless version appears to completely miss words if they are not capitalized.