Extracting the person names in the named entity recognition in NLP using Python

Question

I have a sentence in which I need to identify only the person names:

For example:

sentence = "Larry Page is an American business magnate and computer scientist who is the co-founder of Google, alongside Sergey Brin"

I used the code below to identify the named entities.

from nltk import word_tokenize, pos_tag, ne_chunk
print(ne_chunk(pos_tag(word_tokenize(sentence))))

The output I received was:

(S
  (PERSON Larry/NNP)
  (ORGANIZATION Page/NNP)
  is/VBZ
  an/DT
  (GPE American/JJ)
  business/NN
  magnate/NN
  and/CC
  computer/NN
  scientist/NN
  who/WP
  is/VBZ
  the/DT
  co-founder/NN
  of/IN
  (GPE Google/NNP)
  ,/,
  alongside/RB
  (PERSON Sergey/NNP Brin/NNP))

I want to extract all the person names, such as

Larry Page
Sergey Brin

To achieve this, I referred to this link and tried the following.

from nltk.tag.stanford import StanfordNERTagger
st = StanfordNERTagger('/usr/share/stanford-ner/classifiers/english.all.3class.distsim.crf.ser.gz','/usr/share/stanford-ner/stanford-ner.jar')

However, I keep getting this error:

LookupError: Could not find stanford-ner.jar jar file at /usr/share/stanford-ner/stanford-ner.jar

Where can I download this file?

As noted above, the result I expect, as a list or dictionary, is:

Larry Page
Sergey Brin

@rainer - Thank you so much. I will edit the error now which i am getting — Doubt Dhanabalu, Mar 20 '18 at 15:08
@DoubtDhanabalu You have to download the jar from https://nlp.stanford.edu/software/CRF-NER.html#Download and provide that path in StanfordNERTagger — kanatti, Mar 20 '18 at 15:18
See https://stackoverflow.com/questions/13883277/stanford-parser-and-nltk — alvas, Mar 21 '18 at 03:41

score 11 · Accepted Answer · answered Mar 21 '18 at 03:47

In Long

Please read these carefully:

Understand the solution; don't just copy and paste.

TL;DR

In terminal:

pip install -U nltk

wget http://nlp.stanford.edu/software/stanford-corenlp-full-2016-10-31.zip
unzip stanford-corenlp-full-2016-10-31.zip && cd stanford-corenlp-full-2016-10-31

java -mx4g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer \
-preload tokenize,ssplit,pos,lemma,parse,depparse \
-status_port 9000 -port 9000 -timeout 15000
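
Before switching to Python, it can help to confirm that something is actually listening on the server port. A minimal sketch using only the standard library; `port_is_open` is a hypothetical helper name, and `localhost:9000` is assumed from the `-port 9000` flag above:

```python
import socket

def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False

# Assumed address: the server command above binds port 9000 on this machine.
if port_is_open("localhost", 9000):
    print("CoreNLP server port is reachable")
else:
    print("Nothing is listening on port 9000; start the server first")
```

This only checks that the port accepts connections, not that the annotators have finished loading, so the first tagging request may still be slow.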

In Python

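# Newer NLTK releases no longer provide nltk.tag.stanford.CoreNLPNERTagger;
# CoreNLPParser with tagtype='ner' queries the same server started above.
from nltk.parse.corenlp import CoreNLPParser

def get_continuous_chunks(tagged_sent):
    """Group consecutive non-"O" tokens into chunks of (token, tag) pairs."""
    continuous_chunk = []
    current_chunk = []

    for token, tag in tagged_sent:
        if tag != "O":
            current_chunk.append((token, tag))
        else:
            if current_chunk: # if the current chunk is not empty
                continuous_chunk.append(current_chunk)
                current_chunk = []
    # Flush the final current_chunk into the continuous_chunk, if any.
    if current_chunk:
        continuous_chunk.append(current_chunk)
    return continuous_chunk


# The URL matches the -port 9000 flag used when starting the server.
stner = CoreNLPParser(url='http://localhost:9000', tagtype='ner')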
tagged_sent = stner.tag('Rami Eid is studying at Stony Brook University in NY'.split())

named_entities = get_continuous_chunks(tagged_sent)
named_entities_str_tag = [(" ".join([token for token, tag in ne]), ne[0][1]) for ne in named_entities]


print(named_entities_str_tag)

[out]:

[('Rami Eid', 'PERSON'), ('Stony Brook University', 'ORGANIZATION'), ('NY', 'LOCATION')]

You might find this helpful too: Unpacking a list / tuple of pairs into two lists / tuples
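
Since the question asks for person names only, as a list or dictionary, the `(text, tag)` pairs can be filtered by tag. A minimal sketch that starts from the output shown above, hard-coded here so it runs without the server; `names_by_tag` is a hypothetical helper name, not part of NLTK:

```python
from collections import defaultdict

# Output of the example above, hard-coded so this runs without a CoreNLP server.
named_entities_str_tag = [
    ("Rami Eid", "PERSON"),
    ("Stony Brook University", "ORGANIZATION"),
    ("NY", "LOCATION"),
]

def names_by_tag(entities):
    """Map each entity tag to the list of entity strings carrying it."""
    grouped = defaultdict(list)
    for text, tag in entities:
        grouped[tag].append(text)
    return dict(grouped)

# List form: keep only the PERSON entities.
persons = [text for text, tag in named_entities_str_tag if tag == "PERSON"]
print(persons)

# Dictionary form: every tag with its entities.
print(names_by_tag(named_entities_str_tag))
```

The tag names depend on the CoreNLP models in use, so the `"PERSON"` literal is worth checking against the actual tagger output.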

ImportError: cannot import name 'CoreNLPNERTagger' from 'nltk.tag.stanford' (/home/akshatz/.local/lib/python3.8/site-packages/nltk/tag/stanford.py) — Akshat Zala, Aug 05 '20 at 13:51

score 0 · Answer 2 · answered Nov 16 '18 at 09:41

First, download the jar files and the other necessary files by following this link: https://gist.github.com/troyane/c9355a3103ea08679baf. Run the code there to download the files (except the last few lines). Once the download is done, you are ready for the extraction part.

from nltk.tag.stanford import StanfordNERTagger
st = StanfordNERTagger('/home/saheli/Downloads/my_project/stanford-ner/english.all.3class.distsim.crf.ser.gz',
                   '/home/saheli/Downloads/my_project/stanford-ner/stanford-ner.jar')
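
The snippet above only constructs the tagger. `st.tag(tokens)` returns a list of `(token, tag)` pairs, and runs of consecutive `PERSON` tokens can then be joined into full names. A sketch of that step, using hand-written tags for the question's sentence as a stand-in for real tagger output (an assumption; the actual tags depend on the model), with `extract_persons` as a hypothetical helper name:

```python
from itertools import groupby

def extract_persons(tagged_tokens):
    """Join each run of consecutive PERSON-tagged tokens into one name."""
    names = []
    # groupby yields one group per run of identical adjacent tags.
    for tag, group in groupby(tagged_tokens, key=lambda pair: pair[1]):
        if tag == "PERSON":
            names.append(" ".join(token for token, _ in group))
    return names

# Hand-written stand-in for st.tag(sentence.split()); not real tagger output.
tagged = [
    ("Larry", "PERSON"), ("Page", "PERSON"), ("is", "O"), ("an", "O"),
    ("American", "O"), ("co-founder", "O"), ("of", "O"),
    ("Google", "ORGANIZATION"), ("alongside", "O"),
    ("Sergey", "PERSON"), ("Brin", "PERSON"),
]

print(extract_persons(tagged))  # ['Larry Page', 'Sergey Brin']
```

Two different people mentioned back to back with no token between them would be merged into one name, because the 3-class tags carry no begin/inside markers.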
