I am trying to parse XML that contains some non-ASCII characters.

The code looks like this:

from lxml import etree
from lxml import objectify
content = u'<?xml version="1.0" encoding="utf-8"?><div>Order date                            : 05/08/2013 12:24:28</div>'
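content = content.replace(u'\xa0', u' ')  # replace() returns a new string, so keep the result
xml = etree.fromstring(content.encode('utf-8'))  # lxml rejects unicode input that carries an encoding declaration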

but it raises an error on the line 'content = ...':

SyntaxError: Non-ASCII character '\xc2' in file /home/projects/ztest/responce.py on line 3,
but no encoding declared; see http://www.python.org/peps/pep-0263.html for details

It works in the terminal, but running it in the Eclipse IDE gives me this error.

I don't know how to overcome this.

alecxe
OpenCurious
  • I don't think it's a duplicate. People encounter this Python encoding issue very often. Having this rich style of problem description on SO makes our knowledge base better. – DehengYe Aug 15 '15 at 13:48
  • You will likely get this error if you import a Python 3 file into the Python 2 interpreter. (This question should not be closed: '\xc2' is a very particular sort of problem, and very different from the one raised by the supposed duplicate question. The answer should be made clear here.) – markling Oct 25 '21 at 11:31

1 Answer

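You should declare the source code encoding. Python 2 assumes a source file is ASCII unless told otherwise (see PEP 263), and the '\xc2' in the error is most likely the first byte of a UTF-8 encoded non-breaking space ('\xa0') inside your string literal. Add this as the first or second line of your script: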

# -*- coding: utf-8 -*-
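
To illustrate why the declaration above matters, here is a minimal sketch. It runs on Python 3 only, because it relies on the standard library's tokenize.detect_encoding, and the two tiny "scripts" in it are hypothetical examples, not the asker's file. It shows that a non-breaking space starts with the byte 0xC2 in UTF-8, and that Python 3 falls back to UTF-8 when no declaration is present. Python 2 falls back to ASCII instead, which is why only Python 2 raises the SyntaxError.

```python
import io
import tokenize

# A non-breaking space (U+00A0) is two bytes in UTF-8, and the first is 0xC2.
nbsp_bytes = u"\xa0".encode("utf-8")
print(repr(nbsp_bytes))


def source_encoding(source_bytes):
    """Return the encoding Python 3's tokenizer picks for raw source bytes."""
    encoding, _ = tokenize.detect_encoding(io.BytesIO(source_bytes).readline)
    return encoding


# Hypothetical two-line scripts, with and without a PEP 263 declaration.
with_declaration = b"# -*- coding: latin-1 -*-\nx = 1\n"
without_declaration = b"x = 1\n"

print(source_encoding(with_declaration))     # encoding named by the declaration
print(source_encoding(without_declaration))  # Python 3's default source encoding
```

On Python 2 the same file without a declaration would be read as ASCII, so the first non-ASCII byte stops compilation.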

The reason it behaves differently in the console and in the IDE is, likely, that different default encodings are set. You can check by running:

import sys
print(sys.getdefaultencoding())  # parenthesized form works on Python 2 and 3
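
If you want to compare the two environments in more detail, a slightly broader check could look like the sketch below. The helper name encoding_report is made up for illustration. Run it once in the terminal and once in the IDE and compare the output. Note that these settings affect I/O and implicit conversions, while the source file itself is decoded according to its encoding declaration.

```python
import locale
import sys


def encoding_report():
    """Collect the encodings that commonly differ between a terminal and an IDE."""
    return {
        "default": sys.getdefaultencoding(),
        # stdout may be replaced by an object without an encoding attribute
        "stdout": getattr(sys.stdout, "encoding", None),
        "filesystem": sys.getfilesystemencoding(),
        "locale": locale.getpreferredencoding(False),
    }


for name, value in sorted(encoding_report().items()):
    print("{0}: {1}".format(name, value))
```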

alecxe
  • If I do not include this line, Python 3 does not throw the error, but Python 2 does. The only way to make it work with Python 2 is to add the line `# -*- coding: utf-8 -*-`. But why? – seralouk Oct 03 '19 at 20:47