I am trying to read a dictionary-format text file, but I keep getting an error on the last line of the code below.

I have read other posts and they made complete sense, but I still could not fix my problem.

Any help is appreciated.

import ast

path = '/Users/xyz/Desktop/final/'
filename = 'dictionary_format_text_file.txt'

with open((path+filename), 'r') as f:
    s=f.read()
    s=s.encode('ascii', 'ignore').decode('ascii')

Error:

  Traceback (most recent call last):
  File "/Users/xyz/Desktop/final/boolean_query.py", line 347, in <module>
    s=s.encode('ascii', 'ignore').decode('ascii')

  File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
builtins.UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 81201: ordinal not in range(128)
user3295864
  • Possible duplicate of [UnicodeEncodeError: 'ascii' codec can't encode character u'\xa0' in position 20: ordinal not in range(128)](http://stackoverflow.com/questions/9942594/unicodeencodeerror-ascii-codec-cant-encode-character-u-xa0-in-position-20) – DYZ Mar 01 '17 at 02:40
  • Apparently you didn't search very hard. @DYZ and I both found it within 4 minutes of your post. – Ken White Mar 01 '17 at 02:42
  • Side-note: You used a `with` statement, so calling `f.close()` is redundant/wrong. The whole point of the `with` statement is that it guarantees resources are freed when the block is exited. – ShadowRanger Mar 01 '17 at 02:46
  • @ShadowRanger Understood. Thank you. – user3295864 Mar 01 '17 at 02:47
  • @DYZ I did look over that example but I am not sure how to use that to solve my problem. I am not using the str() function – user3295864 Mar 01 '17 at 02:47
  • `literal_eval` uses `str()`, so, same difference. – DYZ Mar 01 '17 at 02:53
  • @DYZ I tried using `yourstring = yourstring.encode('ascii', 'ignore').decode('ascii')`, but it keeps giving me the same error at that line. Could you possibly tell me how to fix it? – user3295864 Mar 01 '17 at 02:58
  • Firstly, what is it in your file that causes the error? Do you know what non-ASCII symbols are in the file? – DYZ Mar 01 '17 at 03:01
  • `literal_eval` is trying to decode the bytes you read from the file since you didn't provide Unicode. Open the file with `codecs.open` and provide the proper encoding parameter. – Mark Ransom Mar 01 '17 at 03:10
  • The file is a dictionary_format.txt file containing `{....}`. It was written using `file=("file_name",'w')`, so I don't think there would be any non-ASCII characters. Also, this error only occurs on my Macintosh machine, not on my Linux desktop. – user3295864 Mar 01 '17 at 03:12
  • P.S. it's much more useful to see the whole verbose stack trace than to just see the one line. Please try to edit that into your question. – Mark Ransom Mar 01 '17 at 03:12
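
Mark Ransom's suggestion in the comments above, opening the file with an explicit encoding, can be sketched roughly as follows. This is only a sketch under assumptions: the file is assumed to be UTF-8 (the byte 0xe2 in the error is the lead byte of many three-byte UTF-8 sequences, such as curly quotes, but the real encoding is not known from the thread), `io.open` is used in place of `codecs.open` because it accepts an `encoding` argument on both Python 2 and 3, and `load_dict` and the demo file are hypothetical names made up for illustration.

```python
import ast
import io
import os
import tempfile


def load_dict(file_path, encoding="utf-8"):
    """Read a file holding a dict literal and parse it with ast.literal_eval.

    The encoding default is an assumption; pass the file's real encoding.
    """
    # An explicit encoding avoids depending on the platform default,
    # which can differ between machines.
    with io.open(file_path, "r", encoding=encoding) as f:
        text = f.read()
    return ast.literal_eval(text)


# Hypothetical demo file containing one non-ASCII character (U+2019).
demo_path = os.path.join(tempfile.mkdtemp(), "demo_dict.txt")
with io.open(demo_path, "w", encoding="utf-8") as f:
    f.write(u"{'title': 'it\u2019s here', 'count': 3}")

data = load_dict(demo_path)
print(data["count"])
```

Stating the encoding explicitly may also account for the Mac-versus-Linux difference mentioned above, since the default text encoding is taken from the environment, though that is a guess rather than something the thread confirms.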

1 Answer

`f.read` is returning a byte string, not a Unicode string. When you call `encode` on a byte string, Python 2 first tries to decode it using the 'ascii' codec with strict error handling (Python 3 would simply give you an error without trying to decode). It's that hidden decode that is generating the error. You can easily avoid it by getting rid of the redundant `encode`:

s=s.decode('ascii', 'ignore')
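
As a rough, self-contained illustration of that decode-only approach: the sketch below reads the file in binary mode so that `read()` returns bytes on both Python 2 and Python 3 (the traceback path in the question mentions Python 3.4, where a text-mode `read()` returns a `str` that has no `decode` method). The function name `load_dict_ascii_only` and the sample file are hypothetical, and note that `'ignore'` silently drops every non-ASCII byte, which may or may not be acceptable for the data.

```python
import ast
import os
import tempfile


def load_dict_ascii_only(file_path):
    """Parse a dict literal from a file, discarding any non-ASCII bytes."""
    # Binary mode: read() returns raw bytes, so no implicit decode happens.
    with open(file_path, "rb") as f:
        raw = f.read()
    # Decode once, dropping bytes outside the ASCII range.
    text = raw.decode("ascii", "ignore")
    return ast.literal_eval(text)


# Hypothetical sample file; \xe2\x80\x99 is the UTF-8 encoding of U+2019.
sample_path = os.path.join(tempfile.mkdtemp(), "sample_dict.txt")
with open(sample_path, "wb") as f:
    f.write(b"{'title': 'it\xe2\x80\x99s here', 'count': 3}")

result = load_dict_ascii_only(sample_path)
print(result["title"])  # the curly apostrophe has been dropped
```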
Mark Ransom