
I am trying to use the Tamil language in Python, but I ran into difficulties. Here is my code:

U=u'\u0B83'
print U

This throws the error:

UnicodeEncodeError: 'ascii' codec can't encode character u'\u0b83' in position 0 : ordinal not in range(128)

My default encoding is ascii. Since u'\u0b83' is already Unicode, I expected it to print the Tamil character.

I also tried adding # -*- coding: utf-8 -*- to the file, but the result is the same.

How do I set this in unicode?

Rajasankar

2 Answers


In Linux at least, you can set your locale to use UTF-8 before starting Python:

$ export LC_ALL=en_GB.utf8
$ python

You can of course use any locale with a compatible encoding (but I recommend UTF-8).
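
To check which encoding Python actually picked up after changing the locale, something like the following might help. This is only a sketch: the helper name `describe_encodings` is made up for illustration, and the values it reports depend entirely on your environment.

```python
import locale
import sys

def describe_encodings():
    """Report the encodings Python is currently working with.

    `describe_encodings` is a hypothetical helper, not a standard function.
    """
    return {
        # Encoding used when unicode text is written to stdout; may be None
        # when output is piped or redirected.
        'stdout': getattr(sys.stdout, 'encoding', None),
        # Encoding the current locale prefers for text data.
        'preferred': locale.getpreferredencoding(),
    }

info = describe_encodings()
```

If `info['stdout']` still names an ASCII encoding, the locale change did not reach the Python process.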

Alternatively, encode the string when outputting it:

>>> print U.encode('utf-8')
ஃ
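
The same idea as a small self-contained sketch: encode the text yourself and hand the bytes to a byte-oriented stream, so Python never has to fall back to the ascii codec. The helper `write_utf8` is a hypothetical name, and `io.BytesIO` is only a stand-in for a real output stream.

```python
# -*- coding: utf-8 -*-
import io

U = u'\u0B83'  # TAMIL SIGN VISARGA

def write_utf8(text, stream):
    """Encode `text` as UTF-8 and write the bytes to a binary stream.

    `write_utf8` is an illustrative helper, not a library function.
    """
    data = text.encode('utf-8')
    stream.write(data)
    return data

buf = io.BytesIO()   # stand-in for a byte-oriented output stream
write_utf8(U, buf)   # buf now holds the three UTF-8 bytes of U+0B83
```

In Python 2, `sys.stdout` accepts the encoded bytes directly; in Python 3 the byte-oriented stream would be `sys.stdout.buffer`.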
deltab

What I needed was raw-unicode-escape.

If I use encode('raw-unicode-escape').decode('utf-8'), everything works perfectly. I found the answer here: Python Convert Unicode-Hex utf-8 strings to Unicode strings
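
A minimal sketch of what that chain does, assuming the input is a unicode string whose code points are really the individual UTF-8 bytes of the text. The variable `mangled` is a made-up example of such input; it is not taken from the code in the question.

```python
# -*- coding: utf-8 -*-

# Hypothetical input: the three UTF-8 bytes of U+0B83 stored as three
# separate code points in a unicode string.
mangled = u'\xe0\xae\x83'

# raw-unicode-escape emits every code point below U+0100 as a single byte,
# which recovers the original UTF-8 byte sequence.
raw_bytes = mangled.encode('raw-unicode-escape')

# Decoding those bytes as UTF-8 yields the intended Tamil character.
fixed = raw_bytes.decode('utf-8')

# A string that already holds the real character is a different case:
# raw-unicode-escape turns it into the literal escape text instead.
escaped = u'\u0B83'.encode('raw-unicode-escape')
```

So the chain repairs strings like `mangled`, while for a string that already contains U+0B83 the `encode('utf-8')` approach is the one that applies.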

Community
Rajasankar