2

I'm trying to read a file, and I get a Unicode error while reading it.

def reading_File(self,text):

     url_text =  "Text1.txt"
     with open(url_text) as f:
                content = f.read()

Error:

    content = f.read()# Read the whole file
  File "/home/soft/anaconda/lib/python3.6/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0x92 in position 404: ordinal not in range(128)

Why is this happening? I get this error when running on a Linux system, but the same code runs properly on Windows.

snakecharmerb
BTG123
  • Tip : Never mix 4 spaces and 8 spaces indentation. – Vineeth Sai Sep 20 '18 at 06:45
  • Does this answer your question? [UnicodeDecodeError: 'ascii' codec can't decode byte 0xd1 in position 2: ordinal not in range(128)](https://stackoverflow.com/questions/10406135/unicodedecodeerror-ascii-codec-cant-decode-byte-0xd1-in-position-2-ordinal) – Boris Verkhovskiy Dec 11 '19 at 19:39

4 Answers

3

According to the question,

i'm trying to run the same on Linux system, but on Windows it runs properly.

Since we know from the question and some of the other answers that the file's contents are neither ASCII nor UTF-8, it's a reasonable guess that the file is encoded with one of the 8-bit encodings common on Windows.

As it happens, 0x92 maps to the character 'RIGHT SINGLE QUOTATION MARK' in the cp125* encodings, which are used in US and Latin/European regions.

So the file should probably be opened like this:

# Python3
with open(url_text, encoding='cp1252') as f:
    content = f.read()

# Python2
import codecs
with codecs.open(url_text, encoding='cp1252') as f:
    content = f.read()
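
The reasoning above can be checked on a single byte string. This is only a minimal sketch: the sample bytes are made up for illustration and are not taken from the asker's file.

```python
import unicodedata

# Hypothetical sample: "don't" written with a Windows-style curly
# apostrophe, which cp1252 stores as the single byte 0x92.
raw = b"don\x92t"

text = raw.decode("cp1252")
print(unicodedata.name(text[3]))  # RIGHT SINGLE QUOTATION MARK

# The same bytes are rejected by the ASCII and UTF-8 codecs.
failures = {}
for codec in ("ascii", "utf-8"):
    try:
        raw.decode(codec)
    except UnicodeDecodeError as exc:
        failures[codec] = exc.reason
        print(codec, "failed:", exc.reason)
```

If the decoded text of the real file still looks wrong with cp1252, that would suggest a different 8-bit encoding is in use.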
snakecharmerb
1

There can be two reasons for that to happen:

  1. The file contains text encoded with an encoding other than 'ascii' and, according to your comments on other answers, other than 'utf-8'.

  2. The file doesn't contain text at all, it is binary data.

In case 1 you need to figure out how the text was encoded and use that encoding to open the file:

open(url_text, encoding=your_encoding)

In case 2 you need to open the file in binary mode:

open(url_text, 'rb')
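
When the encoding is unknown, the two cases can be combined: read the raw bytes in binary mode, then try a short list of candidate encodings. The sketch below assumes a hypothetical helper name, decode_with_candidates, and a candidate list chosen only for illustration; neither comes from the question.

```python
def decode_with_candidates(data, candidates=("utf-8", "cp1252")):
    """Return (text, encoding) for the first candidate that decodes data.

    The candidate list is an assumption; order matters, because a
    permissive 8-bit codec will accept almost any byte sequence.
    """
    for encoding in candidates:
        try:
            return data.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    raise ValueError("none of the candidate encodings could decode the data")


# In practice the bytes would come from open(path, 'rb').read();
# this sample is made up for illustration.
sample = b"it\x92s"
text, used = decode_with_candidates(sample)
print(used)
```

A successful decode does not prove the guess is right, so it is worth inspecting the resulting text before relying on it.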
Stop harming Monica
0

You can use codecs.open to fix this issue with the correct encoding:

import codecs
with codecs.open(filename, 'r', 'utf8' ) as ff:
    content = ff.read()
napuzba
  • Tried but got this UnicodeDecodeError: 'utf-8' codec can't decode byte 0x92 in position 404: invalid start byte – BTG123 Sep 20 '18 at 06:57
0

It looks like the default encoding in your environment is ASCII, whereas Python 3 normally picks UTF-8 on a system with a UTF-8 locale. The encoding can be passed explicitly when opening the file:

open(file, encoding='utf-8')

Check your system default encoding,

>>> import sys
>>> sys.stdout.encoding
'UTF-8'

If it's not UTF-8, reset the encoding of your system.

 export LANGUAGE=en_US.UTF-8
 export LC_ALL=en_US.UTF-8
 export LANG=en_US.UTF-8
 export LC_CTYPE=en_US.UTF-8
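
Note that sys.stdout.encoding describes the console stream. For files opened without an encoding argument, open() in Python 3.6 falls back to the locale's preferred encoding, so it may be worth inspecting both. A small sketch, with variable names chosen here for illustration:

```python
import locale
import sys

# Encoding that open() uses when no encoding= argument is given
# (assuming Python's UTF-8 mode is not enabled).
file_default = locale.getpreferredencoding(False)

# Encoding of the console stream; this can differ from the file default
# and may be None when stdout is redirected to an object without one.
console = getattr(sys.stdout, "encoding", None)

print("open() default:", file_default)
print("stdout:", console)
```

If the first value reports an ASCII variant, the locale settings are the likely reason the plain open() call fails on this machine.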
Bijendra
  • You mean the output of sys.stdout.encoding. Then reset the encoding of the Unix system you are using, since Python 3 follows it – Bijendra Sep 20 '18 at 07:15
  • Look into this, it may be related https://stackoverflow.com/questions/10406135/unicodedecodeerror-ascii-codec-cant-decode-byte-0xd1-in-position-2-ordinal – Bijendra Sep 20 '18 at 07:21