8

In Python 2.7, I see the following error when trying to cast a value to a string so that it matches the output schema.

UnicodeEncodeError: 'ascii' codec can't encode character in position 0: ordinal not in range(128)

I tried to find out why and reproduced the error in Jupyter by simply typing:

str(u'\u2013')

How can I convert a value to a string in a way that handles this kind of error? Thanks!

Bin
  • A side question: what does the `u'\u'` string pattern mean? – Bin Jan 19 '18 at 18:24
  • It defines a Unicode string, which the Python 2 `str` type cannot hold. This is much more manageable in Python 3; you should definitely not be using Python 2 to learn Python any longer! – tripleee Jan 19 '18 at 18:31

3 Answers

22

Try this:

u'\u2013'.encode('utf-8')
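As a minimal sketch of what that call produces: encoding to UTF-8 turns the Unicode string into a byte string (a Python 2 `str`), and decoding with the same codec recovers the original text. The variable names here are illustrative only.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function

text = u'\u2013'  # EN DASH, a character outside the ASCII range

# Encode explicitly instead of relying on the implicit ASCII conversion in str().
encoded = text.encode('utf-8')
print(repr(encoded))

# Decoding with the same codec round-trips back to the original Unicode string.
decoded = encoded.decode('utf-8')
print(decoded == text)
```

Unlike the `'ignore'` approach, this keeps the character, at the cost of producing bytes that any consumer must treat as UTF-8.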
akhilsp
14

I will answer my own question. I found a duplicate question: stackoverflow.com/questions/9942594/

But for simplicity, here is an elegant solution that works well with my use case:

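def safe_str(obj):
    """Return str(obj), dropping characters that ASCII cannot encode."""
    try:
        return str(obj)
    except UnicodeEncodeError:
        # In Python 2, str() on a unicode object encodes with ASCII and can fail.
        # Drop the offending characters, then decode back to text.
        return obj.encode('ascii', 'ignore').decode('ascii')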

safe_str(u'\u2013')

Or simply use:

u'\u2013'.encode('ascii', 'ignore')
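Note that `'ignore'` silently drops the character. As a rough comparison, the sketch below runs the same call with the other standard error handlers that `encode` accepts; the sample string and the `results` dictionary are made up for illustration.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function

text = u'a\u2013b'  # sample value: an en dash between two ASCII letters

# Each handler decides what happens to characters that ASCII cannot represent.
results = {}
for handler in ('ignore', 'replace', 'xmlcharrefreplace', 'backslashreplace'):
    results[handler] = text.encode('ascii', handler)
    print(handler, repr(results[handler]))
```

Which handler fits depends on whether losing the character is acceptable for the output schema; `'replace'` and `'backslashreplace'` at least leave a visible trace of what was there.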
Bin
2

In Python 2.7.x, the source file encoding defaults to ASCII rather than UTF-8. Add the following declaration as the first line of your program:

# -*- coding: utf-8 -*-
# Your code goes below this line

It should solve your problem.

In Python 3.x, the default encoding is UTF-8, so this error does not occur.
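One caveat worth checking: the coding declaration tells the interpreter how to decode the source file itself, while the implicit conversion done by `str()` uses the interpreter's default encoding, which `sys.getdefaultencoding()` reports. The sketch below prints that value and encodes explicitly so the result is the same on either version; `to_utf8_bytes` is a hypothetical helper name, not part of any library.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function
import sys

# Reports 'ascii' on Python 2 and 'utf-8' on Python 3.
default_encoding = sys.getdefaultencoding()
print(default_encoding)


def to_utf8_bytes(text):
    """Hypothetical helper: encode explicitly rather than rely on the default."""
    return text.encode('utf-8')


dash = u'\u2013'
print(repr(to_utf8_bytes(dash)))
```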

tripleee
Ashish Bainade