This is a follow-up on the code provided by TennisVisuals in the discussion "Python split text on sentences". I tried to split the following paragraph into two sentences, but the code (see the linked discussion) did not work. I was wondering if somebody else can reproduce the problem.
The parser returns a list containing only 1 item for the paragraph (len() of the result is 1), as if the period were not recognized as a sentence delimiter.
TwoSentencesParagraph = "The Minister must prepare an annual report on the implementation of specific programs. The report is included in the annual management report of the Ministere de l’Emploi et de la Solidarite sociale."
The code is provided in the discussion "Python split text on sentences".
It contains these lines (among several others):
def find_sentences(paragraph):
    end = True
    sentences = []
    while end > -1:
        end = find_sentence_end(paragraph)
        if end > -1:
            sentences.append(paragraph[end:].strip())
            paragraph = paragraph[:end]
    sentences.append(paragraph)
    sentences.reverse()
    return sentences
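
For anyone trying to reproduce this, here is a minimal, self-contained sketch of how I am calling the function. Note that find_sentence_end below is a simplified, hypothetical stand-in that I wrote only so the snippet runs on its own; it is NOT the helper from the linked discussion, which is longer and handles more cases. With this stand-in, the loop above does split the paragraph into two sentences, which suggests (but does not prove) that the problem is in the original find_sentence_end rather than in find_sentences itself.

```python
def find_sentence_end(paragraph):
    # Hypothetical stand-in, NOT the helper from the linked discussion.
    # Returns the index just past the last '.', '!' or '?' that is
    # followed by a space, or -1 if there is no such terminator.
    ends = [
        i + 1
        for i, ch in enumerate(paragraph[:-1])
        if ch in ".!?" and paragraph[i + 1] == " "
    ]
    return max(ends) if ends else -1


def find_sentences(paragraph):
    # Same loop as quoted above: peel sentences off the end of the paragraph.
    end = True
    sentences = []
    while end > -1:
        end = find_sentence_end(paragraph)
        if end > -1:
            sentences.append(paragraph[end:].strip())
            paragraph = paragraph[:end]
    sentences.append(paragraph)
    sentences.reverse()
    return sentences


TwoSentencesParagraph = (
    "The Minister must prepare an annual report on the implementation of "
    "specific programs. The report is included in the annual management "
    "report of the Ministere de l’Emploi et de la Solidarite sociale."
)

sentences = find_sentences(TwoSentencesParagraph)
print(len(sentences))
for sentence in sentences:
    print(repr(sentence))
```

Swapping the stand-in for the real find_sentence_end from the linked discussion and printing its return value for this paragraph should show whether it returns -1 on the first call, which would explain the single-item list.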