I am trying to programmatically log into this site using the Python Requests library:

import requests

login_url = 'https://login.drexel.edu/cas/login?service=https://one.drexel.edu/c/portal/login'

login_payload = {
    'username': 'lk12',
    'password': '1q2w3e4r5t6y'
}

# POST only the two visible form fields to the login URL
s = requests.post(login_url, data=login_payload)

print(s.text)

Note that the username and password keys were taken from the id attributes of the username and password fields in the page source.

Upon running the script and looking at the output of s.text, it appears I was never logged in despite my login credentials being valid. The output of s.text is simply the login page.

Using Firefox, I checked what the cURL request looks like:

curl 'https://login.drexel.edu/cas/login?service=https%3A%2F%2Fone.drexel.edu%2Fc%2Fportal%2Flogin' -H 'Host: login.drexel.edu' -H 'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:35.0) Gecko/20100101 Firefox/35.0' -H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8' -H 'Accept-Language: en-US,en;q=0.5' --compressed -H 'Referer: https://login.drexel.edu/cas/login?service=https%3A%2F%2Fone.drexel.edu%2Fc%2Fportal%2Flogin' -H 'Cookie: JSESSIONID=B3DD2F8E144D98090AC63B995D9030AB; IDMSESSID=lk12' -H 'Connection: keep-alive' --data 'username=lk12&password=1q2w3e4r5t6y&lt=LT-745860-yMF7kyKD3SSVfeUmXPiDLWQobbczCq&execution=e2s1&_eventId=submit&submit=Connect'

That request looks exactly like my Python request. I'm not quite sure what I am doing wrong.

theGreenCabbage

1 Answer

Someone can probably help you more, but here are a few things you can look at meanwhile:

  1. Set your User-Agent to that of an actual browser. If you don't, Requests sends its default, "python-requests/<version>", and some (many?) websites block that User-Agent by default. You can check what your script sends by pointing it at: http://whatsmyuseragent.com/

How to set user-agent: Sending "User-agent" using Requests library in Python
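
As a rough sketch of point 1, the snippet below builds the POST without sending it, so you can inspect the headers that would go out. The User-Agent string is simply copied from the Firefox cURL command in the question; any real browser string should do, and the payload here is trimmed for illustration.

```python
import requests

# Assumed value: the Firefox User-Agent from the cURL command in the question.
FIREFOX_UA = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:35.0) "
    "Gecko/20100101 Firefox/35.0"
)

# What Requests sends when no User-Agent is given ("python-requests/<version>").
default_ua = requests.utils.default_user_agent()

# Build the request without sending it, so the outgoing headers can be inspected.
request = requests.Request(
    "POST",
    "https://login.drexel.edu/cas/login",
    headers={"User-Agent": FIREFOX_UA},
    data={"username": "lk12"},
)
prepared = request.prepare()

print(default_ua)
print(prepared.headers["User-Agent"])
```

When sending for real, the same headers dictionary can be passed as requests.post(url, data=payload, headers=...).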

  2. If you plan on keeping the login session to make other requests, you should use a requests.Session() object, which persists cookies across requests.

Python Requests and persistent sessions
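
A minimal sketch of point 2, assuming you want to log in and then fetch some other page with the same cookies. The helper names make_session and login_then_fetch, and the protected_url parameter, are made up for illustration and are not part of Requests.

```python
import requests

# Assumed value: the Firefox User-Agent from the cURL command in the question.
FIREFOX_UA = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:35.0) "
    "Gecko/20100101 Firefox/35.0"
)


def make_session():
    """Create a Session whose headers and cookies persist across requests."""
    session = requests.Session()
    session.headers.update({"User-Agent": FIREFOX_UA})
    return session


def login_then_fetch(session, login_url, payload, protected_url):
    """POST the login form, then request another page on the same session.

    Cookies set by the login response (for example JSESSIONID) are stored on
    the session and sent automatically with the follow-up GET.
    """
    session.post(login_url, data=payload)
    return session.get(protected_url)
```

With plain requests.post() and requests.get() calls, each request starts with an empty cookie jar, so the second request would not carry the login cookies.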

  3. Check the HTML for hidden fields in the form. It looks like there are a few, and you may have to send those too. The lt parameter looks like it may be an authentication token.

If that's true, you need to GET the login page first, read the content, parse the page to extract the token string, and add that to your payload. Notice that your browser request sends those fields. If you match all the values sent, it should work (except for the token value, which is likely generated dynamically each time you request the login page, so it has to be read fresh each time).

HIDDEN FIELDS

<input type="hidden" name="lt" value="LT-392592-vZcTMdl2wvqcC5WEBcW9foIIJiLoWz" />
<input type="hidden" name="execution" value="e2s1" />
<input type="hidden" name="_eventId" value="submit" />
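
Here is one way the GET-then-POST flow from point 3 could look. It is only a sketch under a few assumptions: the hidden fields are plain input tags like the ones shown above, the submit=Connect pair is taken from the cURL data in the question, and the names HiddenInputParser, extract_hidden_fields and cas_login are invented for this example. It uses the standard-library html.parser to stay dependency-free; Beautiful Soup would work just as well.

```python
from html.parser import HTMLParser


class HiddenInputParser(HTMLParser):
    """Collect name/value pairs from <input type="hidden"> tags."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "input" and (attributes.get("type") or "").lower() == "hidden":
            name = attributes.get("name")
            if name:
                self.fields[name] = attributes.get("value") or ""


def extract_hidden_fields(html):
    """Return a dict of every hidden form field found in the page."""
    parser = HiddenInputParser()
    parser.feed(html)
    return parser.fields


def cas_login(session, login_url, username, password):
    """GET the login page, then POST credentials plus the fresh hidden fields.

    `session` is expected to behave like requests.Session (get/post methods).
    """
    login_page = session.get(login_url)
    payload = extract_hidden_fields(login_page.text)  # lt, execution, _eventId
    payload.update(
        {"username": username, "password": password, "submit": "Connect"}
    )
    return session.post(login_url, data=payload)


# Usage (not run here, needs network access and valid credentials):
#   import requests
#   with requests.Session() as session:
#       response = cas_login(session, login_url, "lk12", "your-password")
```

Using the same session for the GET and the POST matters, because the lt token is presumably tied to the JSESSIONID cookie issued with the login page.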
gtalarico
  • Thank you so much for this question! For your third point, are you suggesting that I first GET the login page to obtain the `lt`, `execution`, and `_eventId` values before sending my POST request? – theGreenCabbage Jun 03 '15 at 17:10
  • Hi @gtalarico. I just tried running the cURL command that Firefox generates when it logs in. Even that (which contains all the necessary user agents, hidden params, etc.) didn't log me in. When printing the content of the curl using `print $content`, the page was still the login page where it asks for your username and password. I also tried a custom Python script with the hidden params, user agent, etc., and got the same result. – theGreenCabbage Jun 03 '15 at 17:30
  • Are you sure it's sending the unique 'lt' parameter? If you just copy/paste it, it won't work, because it's generated anew on each GET request. Are you getting the token from the .content after the GET? Something like this: [link](http://stackoverflow.com/questions/13567507/passing-csrftoken-with-python-requests) Also, another thing I remember: you may need to set the referrer URL. [link](http://stackoverflow.com/questions/20837786/changing-the-referrer-url-in-python-requests) I used Live HTTP Headers to get this to work with a few different sites, but perhaps there is something I don't know. – gtalarico Jun 03 '15 at 19:11
  • Hi gta. Thanks for the great information - that's probably what I am missing. The link you sent me for the csrf token, seems to be a cookie thing. The `lt` value seems to be a value from the hidden form, so I probably have to scrape that value. – theGreenCabbage Jun 03 '15 at 22:41
  • That is correct @theGreenCabbage. You can get the .content of the login page, scrape the page (Beautiful Soup?) to retrieve the lt value, and then add that to the payload. Good luck! – gtalarico Jun 03 '15 at 23:10
  • Aaaand I'm in. Thank you so much! – theGreenCabbage Jun 03 '15 at 23:31