I am using the same code to scrape text from a web page, but most of the time it shows “WARNING:root:Some characters could not be decoded, and were replaced with REPLACEMENT CHARACTER.” Surprisingly, it sometimes works: for example, when I ran the code 12 times, only 1 run succeeded.
Same code, same URL. Why is this happening?
from bs4 import BeautifulSoup
import re
import urllib2
url = "http://nz.sports.search.yahoo.com/search?p=basketball&fr=sports-nz-ss&age=1w&focuslim=age"
page = urllib2.urlopen(url)
soup = BeautifulSoup(page.read())
web_p = soup.find_all('span', class_='url')
for web in web_p:
    print web
The traceback is shown below:
Traceback (most recent call last):
  File "C:\Python27\lib\idlelib\run.py", line 112, in main
    seq, request = rpc.request_queue.get(block=True, timeout=0.05)
  File "C:\Python27\lib\Queue.py", line 176, in get
    raise Empty
Empty