I am reading a text file with the following sentence:

"So whether you’re talking about a Walmart or an IKEA or a Zara, you are really interested in keeping the cost low, keeping the process very efficient."

My code:

import glob
import re

files = "*.txt"
for pathname in glob.glob(files):
    with open(pathname,'r') as singlefile:
        data = "".join(singlefile.readlines())
        # join lines that were broken after a word character or a comma
        data = re.sub(r"(?<=\w)\n", " ", data)
        data = re.sub(r",\n", ", ", data)
        print data

The result I got is:

"So whether you鈥檙e talking about a Walmart or an IKEA or a Zara, you are really interested in keeping the cost low, keeping the process very efficient. That gives us operational excellence."

Can anyone tell me what is wrong? Thanks!

Niebieski

1 Answer

If you get the encoding right, it works just fine. For this, also look here, where they describe an encoding guess list, which is a neat idea. I tried it with:

import re

with open("words.txt",'r') as singlefile:
    data = "".join(singlefile.readlines())
    data = re.sub(r"(?<=\w)\n", " ", data)
    data = re.sub(r",\n", ", ", data)
    print data

The file "words.txt" contains this:

 So whether you’re talking about a Walmart or an IKEA or a Zara, you are really interested in keeping the cost low, keeping the process very efficient.    

This is the output:

>>> runfile('E:/programmierung/python/spielwiese/test.py', wdir=r'E:/programmierung/python/spielwiese')
So whether you’re talking about a Walmart or an IKEA or a Zara, you are really interested in keeping the cost low, keeping the process very efficient.
>>> 
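
To make "getting the encoding right" concrete, here is a minimal sketch for Python 3 (io.open also exists in Python 2). It rests on two assumptions of mine, not on anything stated in the question: that the file is saved as UTF-8, and that the garbled output comes from those UTF-8 bytes being interpreted as GBK. The helper names read_text and join_lines and the default guess list ("utf-8", "cp1252") are hypothetical, so adjust the list to the encodings you actually expect.

```python
import io
import re


def read_text(pathname, encodings=("utf-8", "cp1252")):
    """Return the file's text, decoded with the first encoding that succeeds.

    The guess list is an assumption; put the strictest encoding first,
    because a permissive one would accept almost any byte sequence.
    """
    for encoding in encodings:
        try:
            with io.open(pathname, "r", encoding=encoding) as singlefile:
                return singlefile.read()
        except UnicodeDecodeError:
            continue  # wrong guess, try the next encoding
    raise ValueError("none of %r could decode %s" % (encodings, pathname))


def join_lines(data):
    """Apply the same two substitutions as the code in the question."""
    data = re.sub(r"(?<=\w)\n", " ", data)
    return re.sub(r",\n", ", ", data)


# Reproduce the symptom: UTF-8 bytes of the curly apostrophe read as GBK.
original = "you’re"
garbled = original.encode("utf-8").decode("gbk")
print(garbled)
```

Usage would then be `print(join_lines(read_text(pathname)))` inside the glob loop. If the printed text still looks wrong, the console encoding may be the culprit rather than the file.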
Community