I am working on a web scraping project and have chosen Google App Engine to host the script.

I am stuck on sessions. I wrote the script using the requests module, but since Google App Engine doesn't allow the requests module, I tried this instead:

from google.appengine.api import urlfetch

r = urlfetch.fetch("http://www.google.com")
data = {'query': 'search_query'}
r = urlfetch.fetch("http://www.google.com/search", headers=r.headers, method=urlfetch.POST, payload=data)

This is just an example of what I am trying to accomplish, but it does not seem to work with cookies: the cookies change between requests.

I know this because the B cookie value changes from B=ff09f69f244d862c494236ed3d0c6e1c; to B=8b18c48e19c0352b52903c49c45a41d1;, and it is not supposed to change.
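For reference, here is a minimal sketch of what carrying a cookie from one response into the next request might look like. It assumes the problem is that the snippet above passes the response headers (which contain Set-Cookie) back as request headers, whereas a server reads cookies from a Cookie request header. The helper name `build_cookie_header` is hypothetical, and how the Set-Cookie values are pulled out of the urlfetch response is left to the caller.

```python
try:
    from http.cookies import SimpleCookie  # Python 3
except ImportError:
    from Cookie import SimpleCookie  # Python 2 (the App Engine runtime of the time)


def build_cookie_header(set_cookie_values):
    """Turn Set-Cookie response header values into one Cookie request header value.

    Hypothetical helper: attributes such as Path or HttpOnly are dropped,
    and only the name=value pairs are kept.
    """
    jar = SimpleCookie()
    for value in set_cookie_values:
        jar.load(value)  # parses "name=value; Path=/; ..." into the jar
    return "; ".join(
        "%s=%s" % (name, morsel.value) for name, morsel in sorted(jar.items())
    )


cookie_value = build_cookie_header([
    "B=ff09f69f244d862c494236ed3d0c6e1c; Path=/",
    "sid=abc123; HttpOnly",  # made-up second cookie, for illustration only
])
print(cookie_value)
```

With urlfetch, the result would then be sent as `headers={'Cookie': cookie_value}` on the second call. Two further assumptions are worth checking: `payload` is expected to be a string, so the dict would need form-encoding first (for example with `urllib.urlencode(data)` on Python 2), and with the default `follow_redirects=True` any cookies set during a redirect may not be visible in the final response.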

dunstorm
  • You can install the requests package. You need to add it to your app files (say in a 'libs' directory), then it will upload when you deploy. Where are you setting the r.headers? – GAEfan Sep 02 '16 at 15:28
  • It throws `TypeError: expected httplib.Message, got .` when I try to include the requests module. – dunstorm Sep 02 '16 at 15:55
  • Yes, others have reported that. I have not used requests/urllib3 directly; I only use requests inside rauth. You should try the `urllib2.urlopen` method of scraping. – GAEfan Sep 02 '16 at 16:15
  • Is that the only option? – dunstorm Sep 02 '16 at 16:22
  • Read these: http://stackoverflow.com/questions/9604799/can-python-requests-library-be-used-on-google-app-engine?lq=1 https://github.com/shazow/urllib3/issues/618 – GAEfan Sep 02 '16 at 16:30
  • Thanks a lot, I really appreciate this. Post it as an answer and I will accept it :) – dunstorm Sep 02 '16 at 16:49
  • I don't have billing enabled! Is there any other option? :) – dunstorm Sep 02 '16 at 17:02
  • Well, you can enable billing, but set a very low budget. That enables many features. – GAEfan Sep 02 '16 at 17:03
  • I tried requests 0.7.3 and it worked! Thanks – dunstorm Sep 03 '16 at 11:05
