
This is my code. It currently prints the whole line that contains the string, but I want to print the whole sentence that contains searchString.

import os

searchString = 'doctorate'
files = os.listdir()  # current directory, which contains the *.txt files
for eachFile in files:
    with open(eachFile, 'r', encoding='utf-8') as currentFile:
        allLines = currentFile.readlines()
        for line in allLines:
            if searchString in line:
                print(line)

Any help will be appreciated.

  • `read()` the whole document as a single string, then split on `'.'`. You might also need to handle `'?'` and `'!'`, as well as periods in the middle of sentences. – 001 Aug 16 '22 at 13:35
  • [How can I split a text into sentences?](https://stackoverflow.com/q/4576077) – 001 Aug 16 '22 at 13:46
  • Okay... I ran into some trouble: `fullText = currentFile.read()`, then `for searchString in fullText:`, `if searchString in fullText:`, `sentence = [here I don't know how to proceed]` – VeritLibert Aug 16 '22 at 14:10

1 Answer

Here's a very simple, naïve approach to split the text into sentences:

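import re

search = "doctorate"
# path_to_your_file is a placeholder for the path of one of your text files
with open(path_to_your_file, 'r', encoding='utf-8') as file:
    text = file.read().replace('\n', ' ')

# The capturing group keeps each terminator as its own list item,
# so the list alternates: sentence, terminator, sentence, ...
sentences = re.split(r"([.!?])\s+", text)
for index, sentence in enumerate(sentences):
    if search in sentence.lower():
        # The final item has no separate terminator after it
        terminator = sentences[index + 1] if index + 1 < len(sentences) else ""
        print(sentence + terminator)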

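However, this breaks on text with periods inside a sentence, such as abbreviations ("Dr.") or decimal numbers. If your text is more complex than that, you will need more advanced parsing.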
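To tie this back to the loop over files in the question, here is a minimal sketch that applies the same naive splitting to every `*.txt` file in a directory. The function names `sentences_containing` and `search_directory` are hypothetical, made up for this illustration. It assumes the files are UTF-8 encoded and that a sentence ends at `.`, `!` or `?` followed by whitespace. It uses a lookbehind instead of a capturing group, so the terminator stays attached to its sentence and no index arithmetic is needed.

```python
import re
from pathlib import Path


def sentences_containing(text, search):
    """Return the sentences in `text` that contain `search` (case-insensitive)."""
    # Collapse newlines and repeated spaces so sentences wrapped
    # across several lines are joined back together.
    flat = " ".join(text.split())
    # Split on whitespace that follows '.', '!' or '?'. The lookbehind
    # does not consume the terminator, so it stays with its sentence.
    sentences = re.split(r"(?<=[.!?])\s+", flat)
    needle = search.lower()
    return [s for s in sentences if needle in s.lower()]


def search_directory(directory, search):
    """Yield (file name, sentence) pairs for every *.txt file in `directory`."""
    for path in sorted(Path(directory).glob("*.txt")):
        # Assumption: the files are UTF-8 encoded, as in the question.
        text = path.read_text(encoding="utf-8")
        for sentence in sentences_containing(text, search):
            yield path.name, sentence


# Possible usage, searching the current directory:
# for name, sentence in search_directory(".", "doctorate"):
#     print(f"{name}: {sentence}")
```

The same limitation applies: abbreviations and decimals are still split incorrectly, so treat this as a starting point rather than a complete sentence tokenizer.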

001