Python 'utf8' codec can't decode byte 0xc3 in position 72: invalid continuation byte

Question

I am crawling a particular URL from google.com, but I get the following error:

'utf8' codec can't decode byte 0xc3 in position 72: invalid continuation byte

Code:

import re
import os
import MySQLdb
import codecs
import requests
import base64
import random
import gzip
import time
import datetime
import sys
from multiprocessing.pool import Pool

reload(sys)
sys.setdefaultencoding('utf-8')

def proxy_mesh():
    while True:
        try:
            data = requests.get('google.com')
            print data.text.encode('utf-8')
        except Exception, e:
            print e
            print "Trying again"
            time.sleep(3)

proxy_mesh()

What is the fix, and how can I overcome this error?

In other words, you're trying to decode using `utf-8` while the encoding was done differently. — Leb, Mar 23 '16 at 01:33
Can you give the traceback? This could be occurring implicitly in several places. — ShadowRanger, Mar 23 '16 at 01:37
@Mounarajan as suggested in the link I provided, you need to use different encoding. Can't tell you which one without more information. — Leb, Mar 23 '16 at 01:41
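
To illustrate the point made in the comments, here is a minimal offline sketch (Python 3 syntax) of how this error message can arise. The sample text `NÃO` and the helper `try_decode` are assumptions chosen for illustration; they are not taken from the asker's actual response data. Bytes written as Latin-1 contain a lone `0xc3`, which UTF-8 treats as the start of a two-byte sequence, so the byte that follows is rejected as an invalid continuation byte.

```python
# Hypothetical sample: "NÃO" encoded as Latin-1 gives b'N\xc3O'.
raw = "N\u00c3O".encode("latin-1")


def try_decode(raw_bytes, encoding):
    """Return the decoded text, or a short failure note (illustrative helper)."""
    try:
        return raw_bytes.decode(encoding)
    except UnicodeDecodeError as exc:
        # exc.reason holds the short description, e.g. "invalid continuation byte"
        return "failed: {}".format(exc.reason)


utf8_result = try_decode(raw, "utf-8")      # wrong codec for these bytes
latin1_result = try_decode(raw, "latin-1")  # the codec the bytes were written in

print(utf8_result)
print(latin1_result)
```

Decoding with the codec that matches how the bytes were produced succeeds, which is why the right fix depends on knowing the real encoding of the data.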

Mark Tolonen · Accepted Answer · 2016-08-17T08:25:19.370


Keep it simple and it works. The requests module has already decoded the response body, so `data.text` is a Unicode string and needs no further encoding or decoding before printing.

import requests
data = requests.get('https://www.whoisxmlapi.com/whoisserver/WhoisService?domainName=http://N%E2%94%9CO-RESPONDER@MERCAOLIVRE.COM&outputFormat=json')
print data.text

Since the response is JSON, you may also want to parse it:

import json
print json.loads(data.text)
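
If you ever do need to decode raw bytes yourself (with requests, the raw body is `data.content` and the declared encoding is `data.encoding`), a defensive approach is to try the declared encoding first and then fall back. The following is only a sketch under assumptions: `decode_body` is a hypothetical helper, and the fallback order (UTF-8, then cp1252) is a guess that will not suit every site.

```python
def decode_body(raw_bytes, declared_encoding=None, fallbacks=("utf-8", "cp1252")):
    """Decode bytes, returning (text, encoding_used). Hypothetical helper."""
    candidates = [declared_encoding] if declared_encoding else []
    candidates.extend(fallbacks)
    for encoding in candidates:
        try:
            return raw_bytes.decode(encoding), encoding
        except (UnicodeDecodeError, LookupError):
            # Wrong codec for these bytes, or an unknown codec name: try the next.
            continue
    # Last resort: keep going, marking undecodable bytes with U+FFFD.
    return raw_bytes.decode("utf-8", errors="replace"), "utf-8 (with replacement)"


# Bytes that are not valid UTF-8 fall through to cp1252.
text, used = decode_body(b"N\xc3O", declared_encoding="utf-8")
print(used)
```

With a real response this would be called as `decode_body(data.content, data.encoding)`. Silent fallbacks can hide mojibake, so it is worth logging which encoding was actually used.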

edited Aug 17 '16 at 08:25

answered Mar 23 '16 at 01:46

Mark Tolonen
