65

I'm trying to log in to a website for some scraping using Python and the requests library. I am trying the following (which doesn't work):

import requests
headers = {'User-Agent': 'Mozilla/5.0'}
payload = {'username':'niceusername','password':'123456'}

In [12]: r = requests.post('https://admin.example.com/login.php',headers=headers,data=payload)

But nothing: I just get redirected back to the login page. Do I need to open a session? Am I making the POST request incorrectly? Do I need to load the cookies, or does a session do that automatically? I am lost here, so some help and explanation would be appreciated.

The website I'm trying to log in to is written in PHP. Do I need to "capture the set-cookie and set the cookie header"? If so, I have no idea how to do it. The page contains a form with the following (if it helps): inputs 'username' and 'password', 'id': 'myform', 'action': "login.php".

Some extra information, in case you can see what I'm missing here:

In [13]: r.headers
Out[13]: CaseInsensitiveDict({'content-encoding': 'gzip', 'transfer-encoding': 'chunked',
    'set-cookie': 'PHPSESSID=v233mnt4malhed55lrpc5bp8o1; path=/',
    'expires': 'Thu, 19 Nov 1981 08:52:00 GMT', 'vary': 'Accept-Encoding', 'server': 'nginx',
    'connection': 'keep-alive', 'pragma': 'no-cache',
    'cache-control': 'no-store, no-cache, must-revalidate, post-check=0, pre-check=0',
    'date': 'Tue, 24 Dec 2013 10:50:44 GMT', 'content-type': 'text/html'})

In [14]: r.cookies
Out[14]: <<class 'requests.cookies.RequestsCookieJar'>[Cookie(version=0, name='PHPSESSID',
    value='v233mnt4malhed55lrpc5bp8o1', port=None, port_specified=False, domain='admin.example.com',
    domain_specified=False, domain_initial_dot=False, path='/', path_specified=True, secure=False,
    expires=None, discard=True, comment=None, comment_url=None, rest={}, rfc2109=False)]>

I would really appreciate the help, thanks!

Update, with the working answer, thanks to atupal:

import requests

headers = {'User-Agent': 'Mozilla/5.0'}
payload = {'username': 'usr', 'pass': '123'}
link = 'https://admin.example.com/login.php'
session = requests.Session()
# First GET the login page so the server sets its session cookie.
resp = session.get(link, headers=headers)
# Copy the cookies from the session so they can be passed explicitly below.
cookies = requests.utils.cookiejar_from_dict(requests.utils.dict_from_cookiejar(session.cookies))
resp = session.post(link, headers=headers, data=payload, cookies=cookies)
# Firebug's Net panel showed the POST parameters: the password field is actually named 'pass'.
# Subsequent requests go through the same session:
session.get(link)
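
The field-name mismatch found with Firebug ('pass' rather than 'password') can also be spotted by listing the input names of the login form. The following is a minimal sketch using only the standard library. The HTML snippet, the class `FormFieldParser`, and the helper `form_field_names` are hypothetical, since the real page's markup is not shown in the question; in practice the HTML would come from the response to the initial GET of the login page.

```python
from html.parser import HTMLParser

# Hypothetical login form, for illustration only.
SAMPLE_HTML = """
<form id="myform" action="login.php" method="post">
  <input type="text" name="username">
  <input type="password" name="pass">
  <input type="submit" value="Log in">
</form>
"""


class FormFieldParser(HTMLParser):
    """Collect the name attribute of every named <input> inside a <form>."""

    def __init__(self):
        super().__init__()
        self.in_form = False
        self.field_names = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self.in_form = True
        elif tag == "input" and self.in_form and "name" in attrs:
            self.field_names.append(attrs["name"])

    def handle_endtag(self, tag):
        if tag == "form":
            self.in_form = False


def form_field_names(html):
    """Return the input names found in the given HTML, in document order."""
    parser = FormFieldParser()
    parser.feed(html)
    return parser.field_names


print(form_field_names(SAMPLE_HTML))  # ['username', 'pass']
```

The keys of the POST payload should match these names exactly; inputs without a name attribute (such as the submit button here) are not sent and are skipped.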
Alessandro
Captain_Meow_Meow
  • I've tried your solution but didn't work in my case. Could you please take a look? Thanks https://stackoverflow.com/questions/58248578/python-requests-module-to-verify-if-http-login-is-successful-or-not –  Oct 05 '19 at 14:50
  • Thanks for this - the cookie jar etc link integration was necessary for my use case - Strava data grabber beyond what the current API provides, in case anyone is searching for this in future. – Hugh Nolan Jun 05 '22 at 11:02
  • Most of the below answers don't encode the password, so are insecure. A more modern approach would use `requests.post(authEndPoint, auth=requests.auth.HTTPBasicAuth(userName, password))` – MarkHu May 22 '23 at 23:57

3 Answers

95

You can use a Session object, which keeps cookies between requests:

import requests
headers = {'User-Agent': 'Mozilla/5.0'}
payload = {'username':'niceusername','password':'123456'}

session = requests.Session()
session.post('https://admin.example.com/login.php',headers=headers,data=payload)
# the session instance holds the cookie. So use it to get/post later.
# e.g. session.get('https://example.com/profile')
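
To see why the session matters, the sketch below compares a request prepared through a `Session` with one prepared on its own. Nothing is sent over the network. The URL is a placeholder, and the cookie is set by hand to stand in for what a successful login response would store; both are assumptions for illustration.

```python
import requests

url = "https://admin.example.com/profile"  # placeholder URL; no request is sent

session = requests.Session()
# Stand-in for the cookie a login response would have set on the session.
session.cookies.set(
    "PHPSESSID", "v233mnt4malhed55lrpc5bp8o1", domain="admin.example.com", path="/"
)

# Prepared through the session: the stored cookie is attached automatically.
with_session = session.prepare_request(requests.Request("GET", url))

# Prepared on its own: there is no cookie jar to draw from.
without_session = requests.Request("GET", url).prepare()

print(with_session.headers.get("Cookie"))
print(without_session.headers.get("Cookie"))
```

The first request carries the `PHPSESSID` cookie and the second carries none, which is consistent with a bare `requests.post` followed by a bare `requests.get` looking like two unrelated visitors to the server.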
daedalus
atupal
  • Hi atupal, i tried your answer, did resp=session.post(....). the resp.content tells that it's on the same login page. – Captain_Meow_Meow Dec 24 '13 at 11:51
  • @user2627775 That means you haven't logged in successfully. Have you used Firebug or another packet-capture tool to determine what data it sends? – atupal Dec 24 '13 at 12:03
  • i will check it now :) – Captain_Meow_Meow Dec 24 '13 at 12:08
  • it says: Parameters application/x-www-form-urlencoded pass 1234567 username usernamename Source username=1234567&pass=usernamename – Captain_Meow_Meow Dec 24 '13 at 12:22
  • 2
    @user2627775 So the parameter name is not "password" but "pass"? Then just use `{'username':'xx', 'pass': 'yy'}` – atupal Dec 24 '13 at 12:25
  • Yes indeed. I know my question was simple, and yet thanks a lot atupal! You made my day (after lots of tries), you pointed me in the right direction, and now I know more than I knew before. So thanks again! – Captain_Meow_Meow Dec 24 '13 at 13:02
  • Thanks @atupal. I'm having similar problem here could you please take a look? https://stackoverflow.com/questions/58248578/python-requests-module-to-verify-if-http-login-is-successful-or-not –  Oct 05 '19 at 14:56
20

Send a POST request with content type multipart/form-data by passing the fields through the files argument:

import requests
files = {
    'username': (None, 'myusername'),
    'password': (None, 'mypassword'),
}
response = requests.post('https://example.com/abc', files=files)
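
The difference between `data=` and `files=` can be inspected without contacting any server by preparing the request and looking at its headers and body. This is a rough sketch; the URL is the same placeholder as above and the variable names are illustrative.

```python
import requests

url = "https://example.com/abc"  # placeholder URL; nothing is sent
fields = {"username": "myusername", "password": "mypassword"}

# data= encodes the fields as application/x-www-form-urlencoded.
urlencoded = requests.Request("POST", url, data=fields).prepare()

# files= with (None, value) tuples encodes each field as a multipart/form-data
# part that has a name but no filename.
multipart = requests.Request(
    "POST", url, files={name: (None, value) for name, value in fields.items()}
).prepare()

print(urlencoded.headers["Content-Type"])
print(urlencoded.body)
print(multipart.headers["Content-Type"].split(";")[0])
```

Which encoding a given login form expects depends on the site; the browser's network tab shows the Content-Type of the real request.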
HoangYell
8

I was having problems here (i.e. sending form-data whilst uploading a file) until I used the following:

files = {'file': (filename, open(filepath, 'rb'), 'text/xml'),
         'Content-Disposition': 'form-data; name="file"; filename="' + filename + '"',
         'Content-Type': 'text/xml'}

That's the input that ended up working for me. In Chrome Dev Tools -> Network tab, I clicked the request I was interested in. In the Headers tab, there's a Form Data section, and it showed both the Content-Disposition and the Content-Type headers being set there.

I did NOT need to set headers in the actual requests.post() call for this to succeed (including them actually caused it to fail).
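
One plausible explanation for the failure when headers are set by hand (not a confirmed diagnosis of this particular case) is that requests generates the multipart boundary itself and only writes the request-level Content-Type header when the caller has not supplied one. The sketch below prepares both variants without sending anything. The URL, the file name, and the in-memory file contents are hypothetical stand-ins.

```python
import io
import requests

url = "https://example.com/upload"  # hypothetical endpoint; nothing is sent
filename = "report.xml"             # hypothetical file name
payload = io.BytesIO(b"<root/>")    # in-memory stand-in for open(filepath, 'rb')

files = {"file": (filename, payload, "text/xml")}
auto = requests.Request("POST", url, files=files).prepare()

# Each part carries its own Content-Disposition and Content-Type lines.
print(b'filename="report.xml"' in auto.body)
print(b"Content-Type: text/xml" in auto.body)
# requests fills in the request-level header, including the boundary.
print(auto.headers["Content-Type"].startswith("multipart/form-data; boundary="))

# Setting the header by hand keeps the caller's value, which has no boundary.
payload.seek(0)  # rewind, because the first prepare() consumed the stream
manual = requests.Request(
    "POST", url, files=files, headers={"Content-Type": "multipart/form-data"}
).prepare()
print(manual.headers["Content-Type"])
```

Without the boundary parameter in the header, a server generally cannot split the body into its parts, so leaving the header to requests is usually the safer choice.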

bdfariello