I hope everyone is doing well.
I have been following SentDex's YouTube tutorial on using NLTK, with the aim of creating a name recognition program. As you can see from the code below, I have managed to 'chunk' names. However, what I would like to do is put all of the 'chunked' names into a list so that I can easily select them. Is this possible? If not, is there another way of doing it?
import nltk
from nltk.corpus import state_union
from nltk.tokenize import PunktSentenceTokenizer
train_text = state_union.raw("2005-GWBush.txt")
sample_text = state_union.raw("2006-GWBush.txt")
custom_sent_tokenizer = PunktSentenceTokenizer(train_text)
tokenized = custom_sent_tokenizer.tokenize(sample_text)
namedEnt=""
def process_content():
    try:
        for i in tokenized[5:]:
            words = nltk.word_tokenize(i)
            tagged = nltk.pos_tag(words)
            namedEnt = nltk.ne_chunk(tagged, binary=True)
            namedEnt.draw()
    except Exception as e:
        print(str(e))
process_content()
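One possible way to get the chunked names into a list, sketched under a few assumptions: `nltk.ne_chunk(tagged, binary=True)` returns an `nltk.Tree` in which each named entity is a subtree labelled "NE", and the leaves of that subtree are (word, tag) pairs. The helper name `extract_names` and the `all_names` list are hypothetical, introduced only for illustration.

```python
def extract_names(chunked, label="NE"):
    """Return the named-entity strings found in one chunked sentence.

    `chunked` is assumed to behave like the nltk.Tree returned by
    nltk.ne_chunk(tagged, binary=True): entity chunks are subtrees whose
    label is "NE" and whose leaves are (word, pos_tag) pairs.
    """
    names = []
    # Tree.subtrees(filter=...) yields only the subtrees the filter accepts.
    for subtree in chunked.subtrees(filter=lambda t: t.label() == label):
        # Join the words of a multi-word entity into a single string.
        names.append(" ".join(word for word, tag in subtree.leaves()))
    return names


# Hypothetical use inside the loop above, in place of namedEnt.draw():
#     all_names.extend(extract_names(namedEnt))
# where all_names = [] is created before the loop starts.
```

With `binary=False` the subtrees would instead carry labels such as "PERSON" or "GPE", so the `label` argument would need to change accordingly.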