I am playing with the requests module in Python and I am stuck on a problem.

I use requests to log in to a website (http://coinplants.com) using the Session class. After logging in, I try to read the HTML of the page, and I noticed that the response object contains only the HTML body with its content, but not the HTML head. I would like to get the HTML head with the meta tags. Any idea what I am doing wrong?

import requests

s = requests.Session()
# postData is the login form dict built in the LOGIN section below
r = s.post('http://coinplants.com', data=postData)
print(r.text)

Thanks in advance :)

LOGIN

To scrape the authenticity token, I use BeautifulSoup:

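from bs4 import BeautifulSoup

# r is the response from fetching the login page with the session s
soup = BeautifulSoup(r.text, 'lxml')
finding = soup.find('input', {'name': 'authenticity_token'})
# Pass raw values: requests URL-encodes form data itself, so pre-encoded
# strings such as '%E2%9C%93' or 'Log+in' would be encoded a second time.
postData = {'utf8': u'\u2713', 'authenticity_token': finding['value'],
            'account[email]': self.username, 'account[password]': self.password,
            'account[remember_me]': '0', 'commit': 'Log in'}
r = s.post('http://coinplants.com/accounts/sign_in', data=postData)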
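
The token lookup can also be checked offline, without hitting the site. This is only a minimal sketch, assuming the login form contains a hidden input named authenticity_token. It uses the standard library's html.parser in place of BeautifulSoup so that it runs without extra packages; TokenFinder, find_token and the sample HTML are made up for illustration:

```python
from html.parser import HTMLParser


class TokenFinder(HTMLParser):
    """Remembers the value of <input name="authenticity_token">."""

    def __init__(self):
        super().__init__()
        self.token = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "input" and attrs.get("name") == "authenticity_token":
            self.token = attrs.get("value")


def find_token(html):
    """Return the token value, or None if the input is not present."""
    finder = TokenFinder()
    finder.feed(html)
    return finder.token


# Hypothetical login form, not the real page.
sample = '<form><input type="hidden" name="authenticity_token" value="abc123"></form>'
print(find_token(sample))
```

If find_token returns None for the real login page, the form probably looks different from what this sketch assumes.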

Solution

OK, I found a solution to my problem. I have no idea why the session doesn't give me the whole HTML content. I took the cookie from the session object and passed it to a plain request:

cookies = {'_faucet_session': s.cookies['_faucet_session']}
r = requests.get('http://coinplants.com', cookies=cookies)
print(r.text)

s is the session object. When I print the text of the response object, it shows me the whole HTML content, including the head tag. If someone knows why the session object is not showing it, please let me know :)
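
One way to confirm which response actually contains the head is to count the meta tags inside <head> in each r.text. A rough sketch using only the standard library; HeadMetaCollector, head_metas and the sample strings are hypothetical, and in practice you would pass in r.text from the session request and from the plain request:

```python
from html.parser import HTMLParser


class HeadMetaCollector(HTMLParser):
    """Records the attributes of every <meta> tag found inside <head>."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.metas = []

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif tag == "meta" and self.in_head:
            self.metas.append(dict(attrs))

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def head_metas(html):
    """Return a list of attribute dicts, one per <meta> inside <head>."""
    collector = HeadMetaCollector()
    collector.feed(html)
    return collector.metas


# Made-up documents standing in for the two responses.
full = '<html><head><meta name="description" content="demo"></head><body>hi</body></html>'
body_only = '<html><body>hi</body></html>'
print(len(head_metas(full)), len(head_metas(body_only)))
```

Unlike some HTML parsers, html.parser does not insert missing tags, so an empty result here means the head really is absent from the text that was received.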

Donut

2 Answers

0

If I understand you correctly, you are looking for the headers of the page.

When you type

print(r.headers)

you should get the headers of the page.

Or did I misunderstand your question?

This page is very helpful for learning more about the requests module: http://docs.python-requests.org/en/master/

Melody
  • r.headers returns the HTTP headers that the server sends to me, not the HTML head tag. I guess you misunderstood. But I will have a look at the link and hope to find something there. – Donut Jul 26 '16 at 09:04
  • Oh, I'm sorry. But I hope the link can help you! :-) – Melody Jul 26 '16 at 09:07
  • [link](http://stackoverflow.com/questions/9554947/getting-head-content-with-python-requests) Take a look at this question. – Melody Jul 26 '16 at 09:13
  • I just had a look at it. I read a bit about the Range header in HTTP and tried to use it, but somehow I get the whole HTML body content, even when I ask for only 100 bytes... – Donut Jul 26 '16 at 09:36
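
On the Range attempt: a server is allowed to ignore a Range header and answer 200 with the full body instead of 206 Partial Content, which would match the behaviour described in the comment. A sketch of capping the read on the client side instead; take_prefix is a hypothetical helper, and the commented usage assumes a requests response opened with stream=True:

```python
def take_prefix(chunks, limit):
    """Return at most `limit` bytes from an iterable of byte chunks."""
    buf = bytearray()
    for chunk in chunks:
        buf.extend(chunk)
        if len(buf) >= limit:
            break  # stop pulling further chunks once enough bytes arrived
    return bytes(buf[:limit])


# Assumed usage with requests (not run here):
#   r = s.get('http://coinplants.com', stream=True)
#   first = take_prefix(r.iter_content(chunk_size=64), 100)

# Offline demonstration with made-up chunks.
print(take_prefix([b"<html><he", b"ad><meta ", b"charset>"], 12))
```

This only limits how much is read. It does not explain a missing head, since the head comes first in the document.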
-1

Print out r.url, the URL that was actually fetched, and then try to scrape that URL with a GET request.

url = r.url
req = s.get(url)
print(req.text)

Check whether that resolves your issue. If not, open r.url in a browser, inspect the page with that browser's developer tools, and see whether the head tag shows up there. I hope this helps.

sumitroy
  • This does not help. url = r.url returns http://coinplants.com, which is the URL I am sending the GET request to. Looking at the HTML in Firefox, I see both an HTML head and an HTML body tag, each with content. As soon as I use requests, I don't get the HTML head tag :( – Donut Jul 26 '16 at 08:44