I'm using this code to parse a text file and reformat it so that every sentence is on its own line:
import re
# open the file to be formatted
filename=open('inputfile.txt','r')
f=filename.read()
filename.close()
# put every sentence in a new line
# raw string, so that \. reaches the regex engine unchanged
pat = r'(?<!Dr)(?<!Esq)\. +(?=[A-Z])'
lines = re.sub(pat,'.\n',f)
print(lines)
# write the formatted text
# into a new txt file
filename = open("outputfile.txt", "w")
filename.write(lines)
filename.close()
What I actually need, though, is to split any sentence that is longer than 110 characters. When a sentence on a line exceeds 110 characters, it should be cut off with '...' appended at the end of that line, and the next line should start with '...' followed by the rest of the sentence, and so on for as long as the remainder is still too long.
Any suggestions on how to do that? I'm somewhat lost.
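To make the desired behaviour concrete, here is a rough sketch of the splitting step, not a definitive solution. It rests on a few assumptions that the description above does not settle: the 110-character limit is taken to include the '...' markers, lines are broken at whitespace using the standard `textwrap` module rather than in the middle of a word, and the names `split_long_line` and `split_long_lines` are hypothetical helpers made up for this illustration.

```python
import textwrap

MAX_WIDTH = 110   # limit from the question
MARKER = '...'    # continuation marker from the question


def split_long_line(line, max_width=MAX_WIDTH, marker=MARKER):
    """Split one line into pieces of at most max_width characters.

    Assumption: the markers count towards max_width.
    """
    if len(line) <= max_width:
        return [line]
    # Reserve room for a marker at both ends, so that pieces in the
    # middle (which carry two markers) still fit within max_width.
    room = max_width - 2 * len(marker)
    # textwrap.wrap breaks at whitespace and never exceeds `room`.
    chunks = textwrap.wrap(line, width=room) or ['']
    last = len(chunks) - 1
    pieces = []
    for i, chunk in enumerate(chunks):
        prefix = marker if i > 0 else ''     # continuation of a sentence
        suffix = marker if i < last else ''  # sentence continues below
        pieces.append(prefix + chunk + suffix)
    return pieces


def split_long_lines(text, max_width=MAX_WIDTH, marker=MARKER):
    """Apply split_long_line to every line of a multi-line string."""
    out = []
    for line in text.split('\n'):
        out.extend(split_long_line(line, max_width, marker))
    return '\n'.join(out)
```

With something like this in place, the script above could call `lines = split_long_lines(lines)` right before writing to outputfile.txt. If the 110 characters are meant to exclude the markers, `room` would simply be `max_width`.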