I have the following Python code:

# coding: utf-8
# generate.py
my_locale = [
            {
                'tag':'test_tag_long__body',
                'locale':'en_US',
                'text': u"Hello,<br>\n"
                        u"\n"
                        u"<br>\n"
                        u"\n"
                        u"We’re contacting you.\n"
                        u"\n"
                        u"<br>\n"
                        u"\n"
                        u"Sincerely,<br>\n"
                        u"\n"
                        u"<br>\n"
                        u"\n"
                        u"Team<br>\n"
            },
            {
                'tag':'test_tag_long__subject',
                'locale':'en_US',
                'text': 'Important information'
            },
        ]

print "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
print "<strings>"
for item in my_locale:
    print "<string>"
    print "<tag>" + item['tag'] + "</tag>"
    print "<locale>" + item['locale'] + "</locale>"
    print "<text><![CDATA["
    print  item["text"]
    print "]]>"
    print "</text>"
    print "</string>"
print "</strings>"

When I run it as python generate.py, it runs fine with no errors. However, whenever I pipe or redirect the output, it raises an error:

python generate.py | pbcopy
Traceback (most recent call last):
  File "generate2.py", line 34, in <module>
    print  item["text"]
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2019' in position 20: ordinal not in range(128)

Am I missing anything important here?

num3ri
doptimusprime

1 Answer

Try adding these three lines of code right at the beginning of the script:

import sys
reload(sys)
sys.setdefaultencoding('utf8')

I had the same problem. It is caused by non-ASCII characters in the strings: here, item["text"] contains the right single quotation mark u'\u2019'. When the output is piped or redirected, Python 2 cannot detect an output encoding and falls back to ASCII, which cannot encode that character. I tried various encoding and decoding methods offered by several libraries; the only solution that worked in my case is the one shown above. I hope it works for you too.

num3ri
    This is a bad idea. Please don't promote it – see [here](https://stackoverflow.com/q/3828723) for some background on the topic. A clean alternative is setting the `PYTHONIOENCODING` environment variable instead. – lenz Jan 31 '19 at 09:30
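
For reference, here is a minimal sketch of an approach that avoids changing the interpreter-wide default: wrap standard output in a writer that always encodes to UTF-8, whether or not the output is piped. The helper name `utf8_writer` is hypothetical and not part of the original script; `codecs.getwriter` is from the standard library. The environment-variable alternative mentioned in the comment above is shown as a code comment.

```python
# -*- coding: utf-8 -*-
import codecs
import sys


def utf8_writer(stream=None):
    """Return a writer that encodes unicode text to UTF-8 bytes.

    `utf8_writer` is a hypothetical helper name used for illustration.
    If no stream is given, standard output is used.
    """
    if stream is None:
        # Python 3 exposes the byte stream as sys.stdout.buffer;
        # Python 2's sys.stdout accepts bytes directly.
        stream = getattr(sys.stdout, "buffer", sys.stdout)
    return codecs.getwriter("utf-8")(stream)


# Possible usage inside generate.py (assumption: replacing the print
# statements with writes to this object):
#
#     out = utf8_writer()
#     out.write(item["text"] + u"\n")
#
# Alternative without code changes, setting the variable for one run:
#
#     PYTHONIOENCODING=utf-8 python generate.py | pbcopy
```

Because the encoding is chosen explicitly at the point of output, the result does not depend on whether Python could detect a terminal encoding.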