2

I'm trying to use the requests module to log in to a website. I'm not sure what to reference in the HTML form to post the username and password. Here is the form I'm trying to post to in order to log in:

<div class="login-box contents" id="login">
                        <!--<div class="login-instruction">
                            <label class="fl-label"> Enter your information below to login. </label>
                        </div>-->
                        <div class="login-username">
                            <label for="username" class="fl-label">Username: </label>
                            <div class="clearboth"></div>


                            <input id="proxyUsername" name="proxyUsername" class="required" tabindex="1" maxLength="100" type="text" value="" onChange="remove_Error()" autocomplete="off"/>

                        </div>
                        <div class="float-right">
                            <b><input type="checkbox" id="proxyRememberUser" name="proxyRememberUser" tabindex="-1" value="checked">&nbsp;Remember Username</input></b>
                        </div>
                        <br/>
                        <div class="login-password">
                            <label for="password" class="fl-label">Password: </label>
                            <div class="clearboth"></div>

                            <input id="proxyPassword" name="proxyPassword" class="required" tabindex="2" maxLength="50" type="password" value="" onChange="remove_Error()" autocomplete="off" />
                        </div>

I'm trying to figure out where/how in the form I tell it to put the username and password. So in the code below, the keys for the payload_login variable are not correct:

import requests


username = raw_input('Please enter your username: ')
password = raw_input('Please enter your password: ')

payload_login = {
    'Username': username,
    'Password': password
}

with requests.Session() as s:
    # requests needs a URL with a scheme; a bare host name raises MissingSchema
    con = s.post('http://somewebsite.com', data=payload_login)
Jon Clements
Steven Werner
  • 2
    Have you tried using the proper field names (`"proxyUsername"` and `"proxyPassword"`) instead of `"Username"` and `"Password"`? – hlt Aug 18 '14 at 23:46
  • Yes, it wasn't working. Would you normally reference the value of an attribute like that? – Steven Werner Aug 19 '14 at 05:24

2 Answers

3

As @hlt commented, the keys in your payload must match the `name` attributes of the form fields.

The server may also validate the "Remember Username" checkbox, so it is safer to include it in your request.

payload_login = {
    'proxyUsername': username,
    'proxyPassword': password,
    'proxyRememberUser': 'checked'  # the checkbox's value attribute in the form
}

If this does not work for you, it means the site sends the authentication data in a different way. For example, a JavaScript script may add hidden data to the request or encode some fields.

To find out, look for this HTTP request in your browser's developer panel or in an external HTTP sniffer (like Fiddler).
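If the extra data turns out to be hidden `<input>` fields in the login page itself, one way to pick them up is to read the page first and merge its fields into the payload. This is only a sketch for Python 3 under several assumptions: the helper names (`FormFieldParser`, `extract_form_fields`, `build_login_payload`, `login`) are made up for illustration, the form is assumed to post back to the same URL it was served from, and fields added by JavaScript at submit time will not be found this way.

```python
from html.parser import HTMLParser


class FormFieldParser(HTMLParser):
    """Collects name/value pairs from the <input> tags of a page."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != 'input':
            return
        attrs = dict(attrs)
        name = attrs.get('name')
        if not name:
            return
        # Browsers only submit checkboxes and radio buttons that are checked.
        if attrs.get('type') in ('checkbox', 'radio') and 'checked' not in attrs:
            return
        # A missing value attribute is treated as an empty string.
        self.fields[name] = attrs.get('value') or ''


def extract_form_fields(html):
    """Return a dict of the input fields found in the given HTML."""
    parser = FormFieldParser()
    parser.feed(html)
    return parser.fields


def build_login_payload(login_page_html, username, password):
    """Start from the page's own fields, then fill in the credentials."""
    payload = extract_form_fields(login_page_html)
    payload['proxyUsername'] = username
    payload['proxyPassword'] = password
    return payload


def login(session, login_url, username, password):
    """`session` is expected to be a requests.Session (hypothetical helper).

    Assumes the form posts to the same URL that serves the login page.
    """
    page = session.get(login_url)
    payload = build_login_payload(page.text, username, password)
    return session.post(login_url, data=payload)


# Possible usage (URL is a placeholder):
#     import requests
#     with requests.Session() as s:
#         response = login(s, 'http://somewebsite.com/login', username, password)
```

Passing the session in keeps any cookies set by the first GET, which some sites check when the form is posted.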

uhbif19
  • It's still not working, so I think you're correct that there is some other type of encoding going on here, which is not surprising. Thanks for your help – Steven Werner Aug 19 '14 at 19:55
  • I'm looking at the Chrome Dev Panel. Problem is I don't know what to look for or what to do about it. Are there any resources on this subject you can direct me to? Under the Network > Headers tab, I can see my username and password. The full pw is visible. Under Request Headers, it says: Content-Type: application/x-www-form-urlencoded. Is this relevant? Sorry for the vague/giant question. – Steven Werner Aug 19 '14 at 20:43
  • 2
    In the Network tab, you can see the list of all requests performed on this page. Choose the authorisation request and examine its HTTP method, POST params (and also query params or the HTTP payload, depending on the request) and headers. They may contain some secret keys. You need to add the relevant headers and data from the request to your code. – uhbif19 Aug 19 '14 at 23:36
  • @StevenWerner Please share this request on PasteBin. To do that, right-click the relevant request and choose "Save as HAR" (in Google Chrome DevTools; Firebug has similar options). This JSON contains all the information about the request. – uhbif19 Aug 19 '14 at 23:39
  • Sorry for my delayed response. I was hesitant to post this, but I blocked out the login info, so I think it's safe. This PasteBin will expire in 24 hours, so let me know if you'd like me to repost. Any help you can offer is greatly appreciated: http://pastebin.com/KPvSRkzZ – Steven Werner Aug 25 '14 at 20:54
  • @StevenWerner Sorry, but your pastebin is private, so I can't read it. – uhbif19 Aug 26 '14 at 17:44
  • I have extracted the data we need: http://pastebin.com/rEcUc1U5 As you can see, the form posts a lot of fields besides the user and password. Most of them are static and you need to add them to your *payload_login* dictionary. Besides, it is necessary to check whether some of them are dynamic and change from request to request. – uhbif19 Aug 27 '14 at 21:29
  • 2
    To find out which of the params are important, you may use a tool like this: http://hurl.it. – uhbif19 Aug 27 '14 at 21:34
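The inspection described in these comments can be partly scripted once the request has been saved as a HAR file. The sketch below assumes the usual HAR layout (`log.entries[].request` with `method`, `url`, `headers` and, for form posts, `postData.params`); the function names are made up for illustration, and an export that only carries the raw `postData.text` would need extra parsing that is not shown here.

```python
import json


def load_post_requests(har_text):
    """Return the POST requests recorded in a HAR export as a list of dicts."""
    har = json.loads(har_text)
    found = []
    for entry in har['log']['entries']:
        request = entry['request']
        if request['method'] != 'POST':
            continue
        post_data = request.get('postData', {})
        found.append({
            'url': request['url'],
            'headers': {h['name']: h['value']
                        for h in request.get('headers', [])},
            'params': {p['name']: p.get('value', '')
                       for p in post_data.get('params', [])},
        })
    return found


def changed_fields(params_a, params_b):
    """Names whose values differ between two captures of the same form post.

    Fields listed here are candidates for dynamic values (tokens, timestamps)
    that cannot simply be hard-coded into the payload.
    """
    names = set(params_a) | set(params_b)
    return sorted(n for n in names if params_a.get(n) != params_b.get(n))


# Possible usage (file names are placeholders):
#     with open('login1.har') as f1, open('login2.har') as f2:
#         first = load_post_requests(f1.read())[0]['params']
#         second = load_post_requests(f2.read())[0]['params']
#     print(changed_fields(first, second))
```

Saving the login twice and comparing the two captures is a quick way to tell the static fields from the ones that change from request to request.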
1

You will need to add the username and password as authentication header to the request. You can find more details here: http://docs.python-requests.org/en/latest/user/advanced/

You could simply use s.auth = (username, password). That's the easiest way to implement it. But if you want to add it to the header yourself, you will first have to build the header. The authorization header contains the username and password, which need to be base64-encoded. For example:

[In python3]

from base64 import b64encode
import requests

username = input('Please enter your username: ')
password = input('Please enter your password: ')

authHandler = '{0}:{1}'.format(username, password).encode()
authHeader = {'Authorization' : 'Basic {0}'.format(b64encode(authHandler).decode("ascii"))}
with requests.Session() as s:
    con = s.post('http://somewebsite.com', headers=authHeader)

[In python2.7]

from base64 import b64encode
import requests

username = raw_input('Please enter your username: ')
password = raw_input('Please enter your password: ')

authHandler = '{0}:{1}'.format(username, password)
authHeader = {'Authorization' : 'Basic {0}'.format(b64encode(authHandler))}
with requests.Session() as s:
    con = s.post('http://somewebsite.com', headers=authHeader)
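For comparison, here is a sketch of the `s.auth` route next to a hand-built header. It should run unchanged on Python 2.7 and Python 3.3+; the function names are made up for illustration, and it only helps if the server really uses HTTP Basic authentication, which a site with a login form may not.

```python
from base64 import b64encode


def basic_auth_header(username, password):
    """Build the Authorization header for HTTP Basic auth by hand."""
    credentials = u'{0}:{1}'.format(username, password).encode('utf-8')
    return {'Authorization': 'Basic ' + b64encode(credentials).decode('ascii')}


def get_with_session_auth(session, url, username, password):
    """Let requests build the Basic auth header itself.

    `session` is expected to be a requests.Session; every request made
    through it afterwards carries the credentials.
    """
    session.auth = (username, password)
    return session.get(url)


# Possible usage (URL is a placeholder):
#     import requests
#     with requests.Session() as s:
#         response = get_with_session_auth(s, 'http://somewebsite.com',
#                                          username, password)
#         print(response.status_code)
```

A 401 status code in the response would suggest the credentials were rejected or that the site does not accept this kind of authentication.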
ashwinjv
  • Thank you. I'm unclear how to use `s.auth = (username, password)` to log in. Do I declare that and then make a get request to the login URL? – Steven Werner Aug 19 '14 at 17:26
  • @Ashwin Your solution uses HTTP Auth. Why do you think the target site implements it and will parse the HTTP Auth header? Most modern sites don't do that. – uhbif19 Aug 19 '14 at 18:07
  • @StevenWerner yes. After you define the session object, you set its headers. By using s.auth you are setting the HTTP auth header for that session. Then every request in that session would use this authentication header if required. The way the RFC defines the communication for authentication is this: 1) a GET request is sent to a server 2) the server responds with a 401 and details of the realm and kind of authentication 3) if there are authentication headers defined for that type (in the example here it's Basic authentication), the request is then resent with the authentication header. – ashwinjv Aug 19 '14 at 18:17
  • @uhbif19 I agree that there are other modes of authentication like Digest, OAuth and OAuth2. But Basic is still the most common one as far as I know, and hence I used it in the example. @StevenWerner: For other types of auth you can use the requests library as well, see http://docs.python-requests.org/en/latest/user/authentication/ – ashwinjv Aug 19 '14 at 18:22
  • @Ashwin Most sites do not use any type of HTTP Auth at all (except OAuth, which is usually used as an additional login method). They just provide a login form and return some session data to a client that has successfully logged in. – uhbif19 Aug 19 '14 at 18:35
  • @uhbif19 I thought that session data is one of two types, cookie, or auth header. If you have either then it means you are logged on. Most form posts use basic auth. http://docs.oracle.com/javaee/1.4/tutorial/doc/Security5.html – ashwinjv Aug 19 '14 at 18:40
  • 1
    @uhbif19 I take that back, more research is telling me that form-based posts don't use HTTP auth headers. Looking into it. Thanks – ashwinjv Aug 19 '14 at 18:43