Stanford Entity Recognizer (caseless) in Python NLTK

Question

I am trying to figure out how to use the caseless version of the Stanford entity recognizer from NLTK. I downloaded http://nlp.stanford.edu/software/stanford-ner-2015-04-20.zip and unpacked it into Python's site-packages folder. Then I downloaded http://nlp.stanford.edu/software/stanford-corenlp-caseless-2015-04-20-models.jar and placed it in the stanford-ner-2015-04-20 folder. Then I ran this code in NLTK:

from nltk.tag.stanford import NERTagger
english_nertagger = NERTagger('/home/anaconda/lib/python2.7/site-packages/stanford-ner-2015-04-20/classifiers/english.conll.4class.distsim.crf.ser.gz', '/home/anaconda/lib/python2.7/site-packages/stanford-ner-2015-04-20/stanford-corenlp-caseless-2015-04-20-models.jar')

But when I run this:

english_nertagger.tag('Rami Eid is studying at stony brook university in NY'.split())

I get an error:

Error: Could not find or load main class edu.stanford.nlp.ie.crf.CRFClassifier

Any help from someone with experience would be appreciated!

P.S. I can get the non-caseless version working fine, but when analysing search queries I find that users hardly ever capitalize words, and the non-caseless version appears to completely miss words that are not capitalized.

Nikita Astrakhantsev · Accepted Answer · 2016-07-15T13:40:07.993


The second parameter of StanfordNERTagger is the path to the Stanford NER jar file, not the path to the models jar. So change it to the path of stanford-ner.jar (and place that jar there, of course).
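One way to check which jar belongs in the second argument is to look inside it: a jar is a zip archive, and the class named in the error message should be present in stanford-ner.jar but not in a models-only jar. The following is a minimal sketch using only the standard library; the helper name `jar_has_class` is made up for illustration and is not part of NLTK or the Stanford tools.

```python
import zipfile


def jar_has_class(jar_path, class_name="edu.stanford.nlp.ie.crf.CRFClassifier"):
    """Return True if the jar (a zip archive) contains the given Java class."""
    # Java stores the class a.b.C as the archive member a/b/C.class
    member = class_name.replace(".", "/") + ".class"
    with zipfile.ZipFile(jar_path) as jar:
        return member in jar.namelist()
```

Calling `jar_has_class` on the path passed as the second argument should return True for a jar that the tagger can actually launch; if it returns False, Java has no way to find the CRFClassifier main class in that file.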

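It also seems that you should use english.conll.4class.caseless.distsim.crf.ser.gz instead of english.conll.4class.distsim.crf.ser.gz. The caseless classifier comes from stanford-corenlp-caseless-2015-04-20-models.jar, so it has to be extracted from that jar before it can be passed as the first argument.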
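Since a jar is a zip archive, the classifier can be pulled out of the models jar with Python's standard library. This is only a sketch: the function name `extract_classifier` is hypothetical, and the directory layout inside the jar is not assumed, so the code matches on the file name alone.

```python
import os
import shutil
import zipfile

MODEL_NAME = "english.conll.4class.caseless.distsim.crf.ser.gz"


def extract_classifier(jar_path, dest_dir, model_name=MODEL_NAME):
    """Copy the classifier called model_name out of the jar into dest_dir.

    Returns the path of the extracted file; raises FileNotFoundError if the
    jar has no member with that file name.
    """
    with zipfile.ZipFile(jar_path) as jar:
        # Compare only the last path component of each archive member.
        matches = [m for m in jar.namelist()
                   if m.rsplit("/", 1)[-1] == model_name]
        if not matches:
            raise FileNotFoundError(model_name + " not found in " + jar_path)
        os.makedirs(dest_dir, exist_ok=True)
        target = os.path.join(dest_dir, model_name)
        # Stream the member to disk without recreating the jar's folders.
        with jar.open(matches[0]) as src, open(target, "wb") as dst:
            shutil.copyfileobj(src, dst)
    return target
```

With the paths from the question, `jar_path` would be the downloaded stanford-corenlp-caseless-2015-04-20-models.jar and `dest_dir` the classifiers folder of the stanford-ner-2015-04-20 download.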

Thus try the following:

from nltk.tag.stanford import StanfordNERTagger
english_nertagger = StanfordNERTagger('/home/anaconda/lib/python2.7/site-packages/stanford-ner-2015-04-20/classifiers/english.conll.4class.caseless.distsim.crf.ser.gz', '/home/anaconda/lib/python2.7/site-packages/stanford-ner-2015-04-20/stanford-ner.jar')

Update: NERTagger has been renamed to StanfordNERTagger.
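To tie the pieces together, here is a hedged sketch of tagging a lowercase query. It assumes an NLTK version that still ships `nltk.tag.stanford.StanfordNERTagger` and a Java runtime on the PATH; the wrapper name `tag_query` is made up for illustration. It checks both files up front, so a wrong path surfaces as a Python exception rather than as a Java class-loading error.

```python
import os


def tag_query(query, model_path, jar_path):
    """Tag a whitespace-tokenized query with the Stanford NER tagger."""
    # Fail early with a readable message if either file is missing.
    for path in (model_path, jar_path):
        if not os.path.isfile(path):
            raise FileNotFoundError("missing file: " + path)
    # Imported here so the path check above runs even if NLTK is absent.
    from nltk.tag.stanford import StanfordNERTagger
    tagger = StanfordNERTagger(model_path, jar_path)
    # tag() takes a list of tokens and returns (token, label) pairs.
    return tagger.tag(query.split())
```

A call would pass the query string, the path of the caseless classifier, and the path of stanford-ner.jar, in that order.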

edited Jul 15 '16 at 13:40

answered Jun 11 '15 at 13:25



`NERTagger` is no longer available in `nltk.tag.stanford`; `StanfordNERTagger` is available instead. It would be very helpful if you updated your answer. – Rahul K P Jul 15 '16 at 13:03
I must be blind, I don't see `english.conll.4class.caseless.distsim.crf.ser.gz` in the classifier folder of the `2015-04-20` download – LYu Aug 13 '19 at 17:48
