
I keep getting an error when trying to display content from an RSS feed. The feeds I have tried are the Teksyndicate "kitchen sink" feed (UTF-8) and the AMD news feed (encoding not set), both downloaded to my computer so I do not ping their servers every time I run the code.

The Teksyndicate feed gives me 'UnicodeEncodeError: 'charmap' codec can't encode character u'\xc2' in position 183: character maps to <undefined>'

The AMD feed gives me 'UnicodeEncodeError: 'charmap' codec can't encode character u'\u2122' in position 349: character maps to <undefined>'

The code throwing the error:

import xml.etree.ElementTree as ET
xmlTree = ET.parse('amd.rss')
xmlRoot = xmlTree.getroot()
# Element layout, as described in the ElementTree library source:
# <tag attrib>text<child/>...</tag>tail
for i in xmlTree.iter():
    if i.text is not None:
        print i.tag + ': ' + i.text
    else:
        print i.tag + ': None'
print '\n\nxmlRoot'
print xmlRoot.tag
print xmlRoot.attrib
print xmlRoot.text
print xmlRoot.tail

Just an additional note: I am trying to make an RSS feed reader. I know there are ones out there, but I want to make my own just to give it a shot. That is when I ran into this error, and I have no idea how to fix it. At this point I'm just goofing off trying to learn ElementTree.

Termanater13
  • What kind of thing is stdout connected to? Sounds like your terminal can't handle Unicode. – Wooble Aug 08 '14 at 17:28
  • The full traceback error message might have been helpful. – holdenweb Aug 08 '14 at 17:29
  • Try it in IDLE (IDLE supports Unicode). A fully runnable example (i.e. the call to the web service to get the RSS) would be more helpful. What characters are those supposed to be? (u"\x2c" looks like some kind of comma?) And the other one is the trademark symbol? – Joran Beasley Aug 08 '14 at 17:33
  • possible duplicate of [Python Unicode Encode Error](http://stackoverflow.com/questions/3224268/python-unicode-encode-error) – holdenweb Aug 08 '14 at 17:33
  • I ended up finding the solution myself with 'str(unicode(fileCon, errors='ignore'))', where fileCon is the contents of the file. The other things linked to did not fix the issue; I had to experiment with concepts from them. So they did not contain the solution but led to it. – Termanater13 Aug 13 '14 at 18:37

1 Answer


The print statement tries to convert everything it is given to a string, which for Unicode text means encoding it. Your question essentially duplicates this one here. I found it with Google, as you should have!

The problem is that unless you specify an encoding, a default one will be used: ASCII, or, as the 'charmap' in your error message suggests, your console's code page. Many Unicode characters can't be converted to such an encoding (ASCII only has 128 characters in it). The answers given to the other question should tell you what to correct.
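As a rough illustration of specifying the encoding yourself, here is a minimal sketch that escapes whatever the output encoding cannot represent instead of letting print raise. The helper name `console_safe` and the inline `SAMPLE` feed are made up for this example; the real code would keep using ET.parse('amd.rss'). It is written to run on both Python 2 and Python 3.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function, unicode_literals

import sys
import xml.etree.ElementTree as ET


def console_safe(text, encoding=None):
    """Escape characters that `encoding` cannot represent.

    `console_safe` is a hypothetical helper, not part of any library.
    """
    # Fall back to ASCII when stdout reports no encoding (e.g. when piped).
    encoding = encoding or getattr(sys.stdout, "encoding", None) or "ascii"
    # Unencodable characters become backslash escapes instead of raising.
    return text.encode(encoding, "backslashreplace").decode(encoding)


# Made-up stand-in for a downloaded feed file (assumption for this sketch).
SAMPLE = "<item><title>Radeon\u2122 news</title></item>".encode("utf-8")

for element in ET.fromstring(SAMPLE).iter():
    text = element.text if element.text is not None else "None"
    print(console_safe(element.tag + ": " + text))
```

Passing "replace" or "ignore" instead of "backslashreplace" would substitute or drop the offending characters; which is preferable depends on whether you want to see that something was lost.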

holdenweb
  • I have been looking, and everything I have tried keeps popping up a different error. The only difference is the character code being displayed. I have tried to encode it to a specific encoding like utf-8, unicode, and ascii, and even to decode it, and I keep getting errors. – Termanater13 Aug 13 '14 at 18:04