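The following code preserves apostrophes and blanks (along with ASCII letters and digits), and could easily be modified to preserve double quotation marks, if desired. It works by passing str.translate a translation table that is an instance of a str subclass: its __getitem__ receives each character's code point and returns it unchanged to keep the character, or None to delete it. I think the code is fairly easy to understand. It might be made more efficient if necessary.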
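class SpecialTable(str):
    """Translation table: keep blanks, apostrophes, digits and ASCII letters."""

    def __getitem__(self, code):
        # str.translate looks up each character's code point (an int).
        if code == 32 or code == 39 or 48 <= code <= 57 \
                or 65 <= code <= 90 or 97 <= code <= 122:
            return code   # keep the character
        else:
            return None   # delete the character

specialTable = SpecialTable()

with open('temp2.txt') as inputText:
    for line in inputText:
        print(line)
        convertedLine = line.translate(specialTable)
        print(convertedLine)
        print(convertedLine.split(' '))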
Here's typical output: for each input line, the original line, the converted line, and the list of words.
This! is _a_ single (i.e. 1) English sentence that won't cause any trouble, right?
This is a single ie 1 English sentence that won't cause any trouble right
['This', 'is', 'a', 'single', 'ie', '1', 'English', 'sentence', 'that', "won't", 'cause', 'any', 'trouble', 'right']
'nother one.
'nother one
["'nother", 'one']
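If you do want double quotation marks preserved, or want to avoid re-testing the same character over and over, one possible variant is sketched below. It swaps the str subclass for a dict subclass whose __missing__ method decides once per code point and caches the answer. The names KEEP, KeepTable and keep_table are my own, made up for illustration, and I haven't measured whether this is actually faster on your data.

```python
import string

# Characters to keep: ASCII letters, digits, blank, apostrophe,
# and (new in this variant) the double quotation mark.
KEEP = frozenset(map(ord, string.ascii_letters + string.digits + " '\""))


class KeepTable(dict):
    """Hypothetical translation table that deletes code points not in KEEP."""

    def __missing__(self, code):
        # Called by the dict lookup the first time a code point is seen;
        # the decision is stored so later lookups are plain dict hits.
        result = code if code in KEEP else None
        self[code] = result
        return result


keep_table = KeepTable()

line = 'She said "it won\'t work", right?\n'
converted = line.translate(keep_table)
print(converted)
print(converted.split(' '))
```

To go back to dropping double quotation marks, remove the \" from the string used to build KEEP.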