I'm just starting to learn the Python requests module. For practice, I'm trying to handle a challenge/response problem: I want to access the data at http://lema.rae.es/drae/srv/search?val=hacer
With the "Tamper Data" plugin for Firefox, I inspected the HTTP requests that are needed:
GET http://lema.rae.es/drae/srv/search?val=hacer
POST http://lema.rae.es/drae/srv/search?val=hacer
I copied the exact headers that Firefox sends in the two HTTP requests and implemented the JavaScript "challenge" function in Python. Then I do the following:
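import requests

url = "http://lema.rae.es/drae/srv/search?val=hacer"
headers = { ... }  # headers copied from Firefox's GET request
r1 = requests.get(url=url, headers=headers)
html = r1.content.decode("utf-8")
formdata = challenge(html)  # my Python port of the JavaScript "challenge" function
headers = { ... }  # headers copied from Firefox's POST request
r2 = requests.post(url=url, data=formdata, headers=headers)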
Unfortunately, the server does not respond as expected. I checked all the headers I'm sending via "r.request.headers", and they match the headers that Firefox sends exactly (according to Tamper Data).
What am I doing wrong?
You can inspect my full code here: http://pastebin.com/7JAZ9B4s
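The header check described above can also be done before anything is sent. The following is only a sketch, not the code from the pastebin: it builds the POST through a requests.Session and prints the headers that would go out, without contacting the server. The cookie value, the form data and the two example headers are placeholders, and whether the server actually depends on a cookie from the first response is an assumption that has not been confirmed.

```python
import requests

url = "http://lema.rae.es/drae/srv/search?val=hacer"

# Placeholder headers; the real ones would be those copied from Firefox.
firefox_headers = {"User-Agent": "Mozilla/5.0", "Accept": "text/html"}

session = requests.Session()

# Assumption: the first response sets a cookie that the POST must send back.
# The value is a placeholder; in a real run, session.get(url) would fill the jar.
session.cookies.set("TS014dfc77", "placeholder-value", domain="lema.rae.es", path="/")

# Build the POST without sending it, so the outgoing headers can be inspected.
# The form data is a placeholder for whatever challenge(html) returns.
request = requests.Request("POST", url, data={"key": "value"}, headers=firefox_headers)
prepared = session.prepare_request(request)

for name, value in prepared.headers.items():
    print(name, value)
```

With two separate calls to requests.get and requests.post, nothing is shared between the requests, so a Cookie header like the one printed here would only be present if it were added to the headers by hand.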
These are the response headers I should be getting:
Date[Tue, 10 Feb 2015 17:13:53 GMT]
Vary[Accept-Encoding]
Content-Encoding[gzip]
Cache-Control[max-age=0, no-cache]
Keep-Alive[timeout=5, max=100]
Connection[Keep-Alive]
Content-Type[text/html; charset=UTF-8]
Set-Cookie[TS014dfc77=017ccc203c29467c4d9b347fb56ea0e89a7182e52b9d7b4a1174efbf134768569a005c7c85; Path=/]
Transfer-Encoding[chunked]
And these are the response headers I actually get:
Content-Length[5798]
Content-Type[text/html]
Pragma[no-cache]
Cache-Control[no-cache]
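To make the difference between the two header sets easier to see, here is a small sketch that parses the Tamper Data style "Name[value]" lines above and lists what is missing, extra or different. The helper name parse_tamper_headers is made up for this illustration; the header lines are copied from the two dumps above.

```python
import re

def parse_tamper_headers(dump):
    """Parse lines of the form Name[value] into a dict keyed by lower-cased name."""
    headers = {}
    for line in dump.strip().splitlines():
        match = re.fullmatch(r"\s*([^\[\]]+)\[(.*)\]\s*", line)
        if match:
            headers[match.group(1).lower()] = match.group(2)
    return headers

expected = parse_tamper_headers("""
Date[Tue, 10 Feb 2015 17:13:53 GMT]
Vary[Accept-Encoding]
Content-Encoding[gzip]
Cache-Control[max-age=0, no-cache]
Keep-Alive[timeout=5, max=100]
Connection[Keep-Alive]
Content-Type[text/html; charset=UTF-8]
Set-Cookie[TS014dfc77=017ccc203c29467c4d9b347fb56ea0e89a7182e52b9d7b4a1174efbf134768569a005c7c85; Path=/]
Transfer-Encoding[chunked]
""")

actual = parse_tamper_headers("""
Content-Length[5798]
Content-Type[text/html]
Pragma[no-cache]
Cache-Control[no-cache]
""")

# Headers present in only one of the two responses.
missing = sorted(set(expected) - set(actual))
unexpected = sorted(set(actual) - set(expected))

# Headers present in both responses but with different values.
changed = {
    name: (expected[name], actual[name])
    for name in expected.keys() & actual.keys()
    if expected[name] != actual[name]
}

print("missing:", missing)
print("unexpected:", unexpected)
print("changed:", changed)
```

For a live response, dict(r2.headers) could be lower-cased and compared against the expected dict in the same way, since r2.headers is a case-insensitive mapping of header names to values.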