top_100 is a MongoDB collection.

The following code:

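from pymongo import Connection

# Connect first, so the collection exists before it is queried
db = Connection().test
top_100 = db.top_100_thread

# Collect the _id of every document in the collection
x = []
for doc in top_100.find():
    x.append(doc['_id'])

# Each _id is itself a document with a "thread" field
thread = [a["thread"] for a in x]

for doc in thread:
    print doc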

gives this error:

Traceback (most recent call last):
  File "C:\Users\chatterjees\workspace\de.vogella.python.first\src\top_100_thread.py", line 21, in <module>
    print doc
  File "C:\Python27\lib\encodings\cp1252.py", line 12, in encode
    return codecs.charmap_encode(input,errors,encoding_table)
UnicodeEncodeError: 'charmap' codec can't encode character u'\u03b9' in position 10: character maps to <undefined>

What's going on?

Thiem Nguyen
codious
  • Not related to your question, but I would write your first for-loop as a list comprehension `x = [doc['_id'] for doc in top_100.find()]` – Daan Timmer Apr 11 '12 at 12:35
  • @Daan Timmer What is a good source to learn these nuances of Python? – codious Apr 12 '12 at 07:26
  • Well, you already used a list comprehension on your `thread = [...]` line. So you either already know how it works or you are good at copy-pasting. The best way I've found to learn these nuances is by writing code, then going back over it and looking for for-loops. If I find one, I see if I can shorten it by writing more Pythonic code. I didn't really use a source/book/website for it. [This is, however, a really nice source](http://www.dabeaz.com/generators/Generators.pdf) of some smart Python usage. – Daan Timmer Apr 12 '12 at 07:30
  • Thanks, I understood that part of the code but missed that it's a list comprehension. – codious Apr 12 '12 at 07:37

1 Answer

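It's because your document contains some Unicode data. The string being printed includes u'\u03b9' (Greek small letter iota), and the output encoding in use (cp1252, per the traceback) has no mapping for that character, so print fails while encoding it. You need to encode Unicode data explicitly for output instead of printing it directly. See: python 3.0, how to make print() output unicode?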
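As a rough illustration of the point above, here is a minimal sketch that reproduces the failure without MongoDB and shows two possible workarounds. The sample string and the helper name `to_console_bytes` are made up for this example; only the character u'\u03b9' and the cp1252 codec come from the traceback.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function


def to_console_bytes(text, encoding="cp1252"):
    """Hypothetical helper: encode text for an output stream that uses
    `encoding`, substituting '?' for characters it cannot represent."""
    return text.encode(encoding, "replace")


# Made-up sample: e-acute exists in cp1252, Greek iota (u'\u03b9') does not.
sample = u"caf\u00e9 \u03b9"

# Reproduce the error from the question: strict encoding to cp1252 fails.
failed = False
try:
    sample.encode("cp1252")
except UnicodeEncodeError:
    failed = True

# Workaround 1: stay in cp1252 but replace unmappable characters.
safe = to_console_bytes(sample)

# Workaround 2: encode as UTF-8, which can represent every character
# (only displays correctly if the terminal interprets UTF-8).
utf8 = sample.encode("utf-8")

print(failed, repr(safe), repr(utf8))
```

In the question's loop this would amount to something like `print doc.encode("utf-8")` or `print doc.encode("cp1252", "replace")`, assuming each `doc` is a unicode string. Which one is appropriate depends on what the console can actually display.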

DhruvPathak