I keep getting a UnicodeEncodeError when trying to print an 'Á' that I get from a website requested using Selenium in Python 3.4.

I already declared the source encoding at the top of my .py file:

# -*- coding: utf-8 -*-

The function is something like this:

from selenium import webdriver

b = webdriver.Firefox()
b.get('http://fisica.uniandes.edu.co/personal/profesores-de-planta')
dataProf = b.find_elements_by_css_selector('td[width="508"]')
for dato in dataProf:
    print(dato.text)

And the exception:

Traceback (most recent call last):
  File "C:/Users/Andres/Desktop/scrap/scrap.py", line 444, in <module>
    dar_p_fisica()
  File "C:/Users/Andres/Desktop/scrap/scrap.py", line 390, in dar_p_fisica
    print(datos.text) #.encode().decode('ascii', 'ignore')
  File "C:\Python34\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u2010' in position 173: character maps to <undefined>

Thanks in advance.

Andrés Fernández

1 Answer

Already figured it out. As noted in this answer, the encoding error doesn't come from Python itself, but from the encoding that the console is using. So the way to fix it is to run this command (in Windows):

chcp 65001

which sets the console encoding to UTF-8, and then run the program again. Or, if you are working in PyCharm as I was, go to Settings > Editor > File Encodings and set the IDE and Project encodings accordingly.
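
To see why the console encoding matters, here is a minimal sketch that reproduces the failure without Selenium or a browser. It assumes the page text contains U+2010 (HYPHEN), the character named in the traceback; the helper name can_encode and the sample string are only for illustration.

```python
# U+2010 (HYPHEN) is the character reported in the traceback.
# U+00C1 is the accented capital A mentioned in the question.
sample = "profesores\u2010de\u2010planta \u00c1"


def can_encode(text, encoding):
    """Return True if every character of text is representable in encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# The accented A exists in cp1252, so it is not the character that fails.
print(can_encode("\u00c1", "cp1252"))
# U+2010 has no slot in cp1252, which is what print() hits on such a console.
print(can_encode(sample, "cp1252"))
# UTF-8 can represent any Unicode character, hence the chcp 65001 fix.
print(can_encode(sample, "utf-8"))
```

The string is the same in all three calls; only the target encoding changes, so the fix belongs on the output side and not in the scraping code.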

Andrés Fernández
  • a million upvotes. it was the console not python. – AwokeKnowing Mar 02 '16 at 08:09
  • Even after I changed the PyCharm encoding, it still gives me the same error (I solved it with the Windows console, but I'm not able to work inside PyCharm any more) – Soorena Oct 15 '16 at 19:55
  • Always caught this error. Awesome solution. Millions upvotes for it!!! – Vitali Nov 16 '16 at 21:27
  • Also, in Command Prompt run: setx PYTHONIOENCODING utf-8 ... Then, **restart Command Prompt** and you can type echo %PYTHONIOENCODING% to ensure this was set. Inside Python, import sys if you need to and then print(sys.stdout.encoding) and it should say utf-8. – Andrew Aug 10 '17 at 18:46
  • 'utf-8' is not recognized as an internal or external command, what to do for this @Andrew – Joyson Feb 23 '18 at 15:18
  • @Joyson Maybe this? https://stackoverflow.com/q/8320648/1599699 – Andrew Feb 23 '18 at 18:57
  • For those using python logging module, same problem may happen in windows (if display language is US English), either change your windows display language or use encoding parameter and set to utf-8 for file handler for logging. Example: handler = TimedRotatingFileHandler("somelog.log", when="midnight", interval=1, encoding="utf-8") – Gorkem Apr 18 '18 at 08:52
  • vscode terminal has the same issue - switched over to Cmder/ConEmu and it worked fine. You can add this in vscode workspace settings: ```"terminal.integrated.shellArgs.windows": ["/K", "chcp 65001"]``` – Smitty Jul 12 '18 at 01:58
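
If changing the console or environment is not an option, one possible workaround is to check which encoding Python picked for stdout and to wrap the underlying binary stream in a UTF-8 text layer. This is only a sketch: make_utf8_writer is a hypothetical helper name, and it is demonstrated on an in-memory buffer rather than the real console. For the real console you would pass sys.stdout.buffer, and the console must still be able to display UTF-8.

```python
import io
import sys

# The encoding print() will use when writing to the current stdout.
print(sys.stdout.encoding)


def make_utf8_writer(binary_stream):
    """Wrap a binary stream in a text layer that always encodes as UTF-8."""
    return io.TextIOWrapper(binary_stream, encoding="utf-8", line_buffering=True)


# In-memory stand-in for sys.stdout.buffer, so the sketch runs anywhere.
buf = io.BytesIO()
out = make_utf8_writer(buf)

# U+2010 is the character from the traceback; here it is written without error.
print("a\u2010b", file=out)
out.flush()

# The raw bytes show U+2010 encoded as the UTF-8 sequence E2 80 90.
print(buf.getvalue())
```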