UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-3: ordinal not in range(128)

Question

When I run my code, I get this error:

UserId = "{}".format(source[1])
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-3: ordinal not in range(128)

My code is:

def view_menu(type, source, parameters):
    ADMINFILE = 'static/users.txt'
    fp = open(ADMINFILE, 'r')
    users = ast.literal_eval(fp.read())
    if not parameters:
        if not source[1] in users:
            UserId = "{}".format(source[1])
            users.append(UserId)
            write_file(ADMINFILE,str(users))
            fp.close()
            reply(type, source, u"test")
        else:
            reply(type, source, u"test")

register_command_handler(view_menu, 'test', ['info','muc','all'], 0, '')

How can I solve this problem?

Thank you

It's worth pointing out here that this problem is exactly why Python 3.x exists. Are you sure you want to learn all the clumsy stuff necessary to deal with mixing Unicode and non-Unicode strings in an old version of the language just to learn everything all over again in a year or two, rather than just learning the easier and newer way now? — abarnert, Aug 03 '14 at 10:13

score 6 · Answer 1 · answered Aug 03 '14 at 10:11

The problem is that "{}" is a non-Unicode str, and you're trying to format a unicode object into it. Python 2.x handles that by automatically encoding the unicode object with sys.getdefaultencoding(), which is usually 'ascii', and your value contains non-ASCII characters that the 'ascii' codec cannot encode.
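
The implicit step can be reproduced explicitly. This is a minimal sketch, assuming a made-up four-character non-ASCII nickname in place of source[1] (the question does not show the real value). Encoding it with the 'ascii' codec by hand raises the same kind of error that Python 2 raises behind the scenes:

```python
# Hypothetical stand-in for source[1]; the real value is not shown in the question.
nickname = u"\u0645\u062d\u0645\u062f"  # four non-ASCII characters

try:
    # Roughly what Python 2 does implicitly inside "{}".format(nickname)
    nickname.encode("ascii")
    error_message = None
except UnicodeEncodeError as exc:
    error_message = str(exc)

print(error_message)
```

The snippet uses only escapes and the u prefix, so it should behave the same on Python 2.7 and Python 3.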

There are two ways to solve this:

1. Explicitly encode that unicode in the appropriate character set. For example, if it's UTF-8, do "{}".format(source[1].encode('utf-8')).
2. Use a unicode format string: u"{}".format(source[1]). You may still need to encode that UserId later; I have no idea how your write_file function works. But it's generally better to keep everything Unicode as long as possible, only encoding and decoding at the very edges, than to try to mix and match the two.
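
Both options can be sketched as follows. The nickname value and the temporary file are assumptions for illustration only, and io.open stands in for the write_file function, which the question does not show:

```python
import io
import os
import tempfile

nickname = u"\u0645\u062d\u0645\u062f"  # hypothetical stand-in for source[1]

# Option 1: encode explicitly, which yields a UTF-8 byte string.
encoded = nickname.encode("utf-8")

# Option 2: keep the text as Unicode by using a unicode format string.
user_id = u"{}".format(nickname)

# Encode only at the edge: io.open does the encoding while writing.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "users.txt")
with io.open(path, "w", encoding="utf-8") as fp:
    fp.write(user_id)

# Decode at the other edge when reading the file back.
with io.open(path, "r", encoding="utf-8") as fp:
    round_tripped = fp.read()

print(round_tripped == nickname)

# Clean up the temporary file and directory.
os.remove(path)
os.rmdir(tmp_dir)
```

With io.open the program never handles byte strings itself, which is one way to follow the "encode only at the edges" advice.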

All that being said, this line of code is useless. "{}".format(foo) converts foo to a str, and then formats it into the exact same str. Why?

Thank you. It works now when I use `"{}".format(source[1].encode('utf-8'))` — yuyb0y, Aug 04 '14 at 14:49

score 5 · Answer 2 · edited Oct 28 '15 at 21:13

Use these functions when handling strings of unknown encoding.

To work with the text, decode it to unicode first:

def read_unicode(text, charset='utf-8'):
    if isinstance(text, basestring):
        if not isinstance(text, unicode):
            text = unicode(text, charset)
    return text

To store the text, for example in a database, encode it:

def write_unicode(text, charset='utf-8'):
    return text.encode(charset)
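
The read_unicode helper relies on basestring and unicode, which exist only in Python 2. A version-neutral sketch of the same decode-on-input, encode-on-output idea might look like this. The names to_text and to_bytes are hypothetical and not part of the answer, and the sketch relies on bytes being an alias for str in Python 2.6 and later:

```python
def to_text(value, charset="utf-8"):
    """Decode byte strings to text; return anything else unchanged."""
    if isinstance(value, bytes):
        return value.decode(charset)
    return value


def to_bytes(text, charset="utf-8"):
    """Encode text to bytes for storage or transmission."""
    return text.encode(charset)


raw = b"caf\xc3\xa9"     # UTF-8 bytes as they might arrive from a file or socket
text = to_text(raw)      # decoded once, at the input edge
stored = to_bytes(text)  # encoded once, at the output edge
print(stored == raw)
```

Passing already-decoded text to to_text returns it unchanged, so the helper is safe to call on values of either kind.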

score 0 · Answer 3 · edited May 23 '17 at 11:54

One solution is to set the default encoding to utf-8 instead of ascii in your sitecustomize.py.

See: Changing default encoding of Python?


answered Aug 05 '14 at 08:13

user3876129


score -2 · Answer 4 · answered Aug 03 '14 at 10:11

Your file static/users.txt must contain some non-Unicode characters. You must specify an encoding in your program, for instance UTF-8. You can read more about it here: Unicode HOWTO.

answered Aug 03 '14 at 10:11

amatellanes


A character whose identity is not assigned by means of the Unicode tables. – amatellanes Aug 03 '14 at 10:31
@amatellanes could you please check this question `http://stackoverflow.com/questions/25088887/nimbuzz-login-config-code-dosnt-work` and help me with my problem? – yuyb0y Aug 04 '14 at 15:44
@amatellanes: His file almost certainly does not contain any non-Unicode characters. And, if it did, UTF-8 wouldn't help, because UTF-8 only encodes Unicode characters. – abarnert Aug 05 '14 at 03:26
