I am new to Python and NLTK. I want to tokenize a string into sentences and add a few extra strings to the abbreviation list that the NLTK sentence splitter uses. I used the code from the post "How to tweak the NLTK sentence tokenizer". Below is the code I have written:
import nltk
extra_abbreviations = ['\n']
sentence_tokenizer = nltk.data.load('tokenizers/punkt/english.pickle')
sentence_tokenizer._params.abbrev_types.update(extra_abbreviations)

sent_tokenize_list = sentence_tokenizer(document)
sent_tokenize_list
This gives me the following error:
TypeError                                 Traceback (most recent call last)
<ipython-input> in <module>()
      4 sentence_tokenizer._params.abbrev_types.update(extra_abbreviations)
      5 
----> 6 sent_tokenize_list = sentence_tokenizer(document)
      7 sent_tokenize_list
TypeError: 'PunktSentenceTokenizer' object is not callable
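From what I can tell from other examples, PunktSentenceTokenizer seems to expose a tokenize() method rather than being callable directly, so I suspect the last lines should look something like the sketch below (document here is just placeholder text I made up, and I am not sure this is the right approach):

import nltk

# placeholder text only, so the snippet can be run on its own
document = "Hello world. This is a test.\nDr. Smith arrived."

extra_abbreviations = ['\n']

# load the default English Punkt model and add my extra abbreviations
sentence_tokenizer = nltk.data.load('tokenizers/punkt/english.pickle')
sentence_tokenizer._params.abbrev_types.update(extra_abbreviations)

# my guess: call the tokenize() method instead of the object itself
sent_tokenize_list = sentence_tokenizer.tokenize(document)
print(sent_tokenize_list)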
How do I fix this?