import webapp2
from google.appengine.api import urlfetch

class Crawl(webapp2.RequestHandler):
    def get(self):
        url = "http://www.example.com/path/to a/page"  # URL with a space
        result = urlfetch.fetch(url)
        self.response.write('status: %s' % result.status_code)  # Outputs 400
        self.response.write(result.content)  # Gives me the 400 error page
There are thousands of URLs out there that contain spaces, and there is no way we can correct them one by one.
Why does urlfetch get a 400 Bad Request error for this kind of URL, when it is perfectly accessible through a browser? How can I overcome this?
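A likely explanation is that browsers percent-encode the space (as `%20`) before sending the request, while urlfetch passes the URL through unchanged, so the server sees a malformed request line. One possible workaround is to encode the URL yourself before fetching it. The sketch below is only an illustration: `encode_url` is a hypothetical helper name, not part of urlfetch, and it assumes ASCII URLs whose path and query are the only parts that need escaping.

```python
try:
    # Python 3 locations
    from urllib.parse import quote, urlsplit, urlunsplit
except ImportError:
    # Python 2 locations (the classic App Engine runtime)
    from urllib import quote
    from urlparse import urlsplit, urlunsplit


def encode_url(url):
    """Percent-encode unsafe characters (such as spaces) in a URL.

    Hypothetical helper: only the path and query are escaped. '%' is
    treated as safe so already-encoded URLs are not encoded twice.
    """
    parts = urlsplit(url)
    path = quote(parts.path, safe="/%")
    query = quote(parts.query, safe="=&%+")
    return urlunsplit((parts.scheme, parts.netloc, path, query, parts.fragment))


if __name__ == "__main__":
    print(encode_url("http://www.example.com/path/to a/page"))
```

In the handler, the encoded value would then be passed along as `urlfetch.fetch(encode_url(url))`. Whether this covers every URL you crawl depends on how those URLs are malformed, so treat it as a starting point.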