1

I have a file.txt with the input

Straße
Straße 1
Straße 2

I want to read this text from the file and print it. I tried the following, but it doesn't work.

import random
lmao1 = open('file.txt').read().splitlines()
lmao = random.choice(lmao1)
print str(lmao).decode('utf8')

But I get the error:

UnicodeDecodeError: 'utf8' codec can't decode byte 0xdf in position 5: invalid continuation byte

Cœur
Dking1199

4 Answers

1

UTF-8 is not the correct encoding for this file. Byte 0xdf is ß in Latin-1, so try decoding as latin-1; if that doesn't work, try other common encodings until you find the right one.

print str(lmao).decode('latin-1')
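
To see why the choice of codec matters, here is a minimal sketch that decodes the same bytes both ways. The byte string `raw` is an assumption standing in for one line of file.txt as stored in Latin-1; the names `raw`, `utf8_ok`, and `text` are illustrative only. It should run on Python 2.7 and Python 3.

```python
# Assumed sample: "Straße" as it would be stored in a Latin-1 / cp1252 file.
raw = b'Stra\xdfe'

# In UTF-8, 0xdf announces a two-byte sequence, but the next byte ('e')
# is not a valid continuation byte, so decoding fails.
try:
    raw.decode('utf-8')
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# In Latin-1 every byte maps to exactly one character, and 0xdf is ß.
text = raw.decode('latin-1')
```

Note that latin-1 accepts any byte sequence, so a successful decode does not by itself prove that the file really is Latin-1.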
Evan
1

If on Windows, the file is likely encoded in cp1252.

Whatever the encoding, use io.open and specify the encoding. This code will work in both Python 2 and 3.

io.open will return Unicode strings. It is good practice to immediately convert to/from Unicode at the I/O boundaries of your program. In this case that means reading the file as Unicode in the first place and leaving print to determine the appropriate encoding for the terminal.

I also recommend switching to Python 3, where Unicode handling is greatly improved.

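from __future__ import print_function
import io
import random

# io.open decodes the file to Unicode text as it is read.
with io.open('file.txt', encoding='cp1252') as f:
    lines = f.read().splitlines()
line = random.choice(lines)
print(line)  # print picks the encoding for the terminal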
Mark Tolonen
0

You're on the right track with decode. The only problem is that there is no way to guess a file's encoding with 100% certainty. Try a different encoding (e.g. latin-1).
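
One way to try several encodings in turn is sketched below. The helper `decode_with_fallback` and its default candidate list are hypothetical, not part of any library, and the sample bytes are an assumption standing in for a line of file.txt. latin-1 is placed last because it never raises an error, so it would otherwise hide every other candidate.

```python
def decode_with_fallback(data, encodings=('utf-8', 'cp1252', 'latin-1')):
    """Return (text, encoding) for the first candidate that decodes cleanly."""
    for enc in encodings:
        try:
            return data.decode(enc), enc
        except UnicodeDecodeError:
            continue  # this candidate cannot represent the bytes; try the next
    raise ValueError('none of the candidate encodings matched')


# Assumed sample: "Straße 1" as stored in a cp1252 / Latin-1 file.
text, used = decode_with_fallback(b'Stra\xdfe 1')
```

A clean decode only means the bytes are valid in that encoding, not that the result is what the author wrote, so it is still worth checking the output by eye.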

Bastian Venthur
-1

It works fine for me, both at the Python prompt and when run from a Python script.

>>> import random
>>> lmao1 = open('file.txt').read().splitlines()
>>> lmao = random.choice(lmao1)
>>> print str(lmao).decode('utf8')
Straße 2

The above worked on Python 2.7. Which Python version are you using?

jagatjyoti