
I am trying to read a text file that contains non-ASCII characters. However, it seems the lines read from the file are not unicode strings.

# -*- coding: utf-8 -*-
from __future__ import unicode_literals

with open("/home/biswadip/Desktop/test", "r") as f:
    content = f.read().splitlines()
print(content)
a = content[0]
print(type(a))

It produces the following output:

['\xc2\xb0', 'a', 'b']

<type 'str'>

Since a is not a unicode string, I can't do normal operations on it such as concatenating another string; doing so raises the "'ascii' codec can't decode" error. I thought from __future__ import unicode_literals was supposed to take care of this, but apparently it does not. I know I can call reload(sys) and set the default encoding to utf-8, but I don't think that is a viable solution.
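
For reference, here is a minimal sketch of one commonly suggested approach, assuming the file is UTF-8 encoded (the bytes \xc2\xb0 in the output are the UTF-8 encoding of the degree sign, U+00B0). The unicode_literals import only changes how string literals in the source file are interpreted; it does not decode data read from a file. Opening the file with io.open and an explicit encoding decodes the lines as they are read. The temporary file below is a hypothetical stand-in for /home/biswadip/Desktop/test, so the snippet can be run on its own.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function, unicode_literals

import io
import os
import tempfile

# Hypothetical stand-in for the real file: the same three lines,
# written as raw UTF-8 bytes.
fd, path = tempfile.mkstemp()
os.close(fd)
with io.open(path, "wb") as f:
    f.write(b"\xc2\xb0\na\nb\n")

# io.open decodes while reading, so every line comes back as a
# unicode string (unicode on Python 2, str on Python 3).
with io.open(path, "r", encoding="utf-8") as f:
    content = f.read().splitlines()

os.remove(path)

a = content[0]
# Concatenating with a unicode literal no longer triggers an
# implicit ASCII decode.
joined = a + "C"
print(type(a), repr(joined))
```

If the data has already been read as a byte string, calling a.decode("utf-8") on it should give the same unicode value, again assuming the bytes really are UTF-8.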

NickD
Biswadip Mandal
