I am trying to use Python (3.9.1) and requests (2.25.1) to log in via Shibboleth and Duo two-factor authentication (2FA). I have all of the appropriate credentials and routinely log in with a browser. Ultimately I want to automate some tasks that happen after a successful login. I was previously able to log in via Shibboleth alone, and I posted an answer to an old question once I solved it. Now, however, my institution requires Duo 2FA and I must rework the authentication part of my code. I almost have it working but am failing at nearly the last step. I have been using Chrome DevTools to track the flow and am using it as a guide for which requests to make and what information to pass in each one. In this post, I have replaced goodies I do not wish to share with "XXXXXXXXXXX".
The flow of requests using a session object is as follows. At the end of each request description I show the response's redirect history in italics and the final status code in bold. No italicized code means no redirection. All my code is below this text description.
- Request 1 - Request the ultimate final URL where I wish to grab data and begin my automation. *302, 302*, **200**
- Request 2 - The session follows a redirect to a SAML-based IDP URL that expects my username and password. *302*, **200**
- Request 3 - After I successfully pass my credentials, the session is handed off to Duo. Duo checks my IDP authentication. *302*, **200**
- Request 4 - After successful IDP authentication, Duo prompts my device for 2FA, and I do receive this prompt. **200**
- Request 5 - Ask the Duo API for status. Duo answers with 'OK' and reports that a login request was 'pushed'. **200**
- Request 6 - Ask the Duo API for status again. Duo answers with 'Success' and provides the txid needed for the last step of Duo 2FA. **200**
- Request 7 - Post the txid to the Duo API. **200**
- Request 8 - This URL was given in the response to Request 2 and passed through Duo as the parent. A successful request here gives me the RelayState and the SAMLResponse. I do obtain these two variables, and yes, the SAMLResponse sometimes approaches 50,000 characters. **200**
- Request 9 - Post the RelayState and SAMLResponse to Shibboleth. **500**
- Request 10 - Finally, I would expect to arrive at my ultimate final destination URL, but so far I can't get past request 9.
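For reference, the redirect history and final status code listed for each request above can be read off a response object rather than copied from DevTools. The following is a minimal sketch: `status_chain` is a hypothetical helper name, and `fake_r1` is a stand-in object with the same `history` and `status_code` attributes that a `requests.Response` exposes, used so the snippet runs without network access.

```python
from types import SimpleNamespace


def status_chain(response):
    """Return the status code of every redirect hop, followed by the final status."""
    return [hop.status_code for hop in response.history] + [response.status_code]


# Stand-in for a requests.Response; a real one would come from s.get(url1).
fake_r1 = SimpleNamespace(
    history=[SimpleNamespace(status_code=302), SimpleNamespace(status_code=302)],
    status_code=200,
)

print(status_chain(fake_r1))
```

Calling `status_chain(r1)` through `status_chain(r9)` after each step would give the same per-request summary as the list above.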
The salient parts of the response text from request 9 are:
- "opensaml::FatalProfileException"
- "Status: urn:oasis:names:tc:SAML:2.0:status:Requester"
- "Sub-Status: urn:oasis:names:tc:SAML:2.0:status:AuthnFailed"
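One way to narrow this down might be to decode the SAMLResponse obtained in request 8 and check which status the IDP put into it before it is ever posted to Shibboleth. In the SAML HTTP-POST binding the form value is base64-encoded XML. The sketch below assumes the value may also be HTML-escaped in the page, and `saml_status_codes` is a hypothetical helper name. The sample document is made up for illustration and is not taken from my actual response.

```python
import base64
import html
import xml.etree.ElementTree as ET

SAMLP_NS = "urn:oasis:names:tc:SAML:2.0:protocol"


def saml_status_codes(saml_response_value):
    """Decode a SAMLResponse form value and list its StatusCode values in document order."""
    raw = base64.b64decode(html.unescape(saml_response_value))
    root = ET.fromstring(raw)
    return [el.get("Value") for el in root.iter("{%s}StatusCode" % SAMLP_NS)]


# Made-up sample carrying the same status pair as the error text above.
sample_xml = (
    '<samlp:Response xmlns:samlp="urn:oasis:names:tc:SAML:2.0:protocol">'
    "<samlp:Status>"
    '<samlp:StatusCode Value="urn:oasis:names:tc:SAML:2.0:status:Requester">'
    '<samlp:StatusCode Value="urn:oasis:names:tc:SAML:2.0:status:AuthnFailed"/>'
    "</samlp:StatusCode>"
    "</samlp:Status>"
    "</samlp:Response>"
)
sample_value = base64.b64encode(sample_xml.encode("utf-8")).decode("ascii")

print(saml_status_codes(sample_value))
```

If the decoded response from request 8 already contains the AuthnFailed status, the failure would have happened on the IDP side before request 9, which is only a guess until the real value is inspected.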
The requests session cookie jar seems to capture every cookie that I see in Chrome DevTools for all of the requests above except request 8. In Chrome DevTools, the response headers for request 8 include five instances of "Set-Cookie", and I only see one of them in the cookie jar. The one cookie I do observe is "BIGipServer~prt_shib~pl_idpXXXXXXX", which was placed in the cookie jar after request 1. The cookies that I see in Chrome DevTools but not in my response to request 8 are "shib_idp_session_ss", "XXXXX-rememberme-XXXXXXXXX", "XXXX-optin-XXXXXXXXXX", and "shib_idp_session". I focus on these cookies because (1) they are a difference between Chrome DevTools and my requests session and (2) when searching for the response text of the failed request 9, I see various posts around the web regarding cookie issues. Because I restart my session from scratch each time, I should not have any stale cookies or other active logins. Regarding active logins, I can be logged in with Firefox, Chrome, and Safari all at once with the same credentials.
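The comparison described above can be made mechanical. This is a minimal sketch with hypothetical helper names (`cookie_name`, `missing_from_jar`) and made-up cookie values. It takes the raw Set-Cookie header strings seen for a response and the set of cookie names currently in the jar, and reports the names that never made it in.

```python
def cookie_name(set_cookie_header):
    """Extract the cookie name from one raw Set-Cookie header value."""
    return set_cookie_header.split(";", 1)[0].split("=", 1)[0].strip()


def missing_from_jar(set_cookie_headers, jar_names):
    """Return the cookie names that were sent by the server but are not in the jar."""
    return [
        name
        for name in (cookie_name(h) for h in set_cookie_headers)
        if name not in jar_names
    ]


# Made-up header values standing in for what DevTools shows on request 8.
browser_headers = [
    "shib_idp_session=abc; Path=/idp; Secure; HttpOnly",
    "shib_idp_session_ss=def; Path=/idp; Secure; HttpOnly",
    "BIGipServer=ghi; path=/",
]

print(missing_from_jar(browser_headers, {"BIGipServer"}))
```

With requests, the jar names could be collected as `{c.name for c in s.cookies}`, and I believe the raw headers of a single response are available through `r8.raw.headers.getlist('Set-Cookie')`, though I have not verified that on every urllib3 version. Cookies set on intermediate redirects would appear on the entries of `r8.history` rather than on `r8` itself.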
Am I correct to focus on these shib_idp_session cookies in my effort to solve this problem? Should I be looking earlier in the request flow (e.g. before Duo) for the reason these cookies are not being passed to me? If I am successfully receiving the RelayState and SAMLResponse, why am I not also receiving the shib_idp_session cookies?
Lastly, I know HTTP requests can be sensitive to headers. I am naively using the status code as my indicator that I have the headers right. Perhaps I am missing a header early on that would prompt the IDP to give me the shib cookies?
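Rather than relying on the status code alone, the headers could be compared directly. The sketch below assumes the browser's request headers have been copied out of DevTools into a dict. `header_diff` is a hypothetical helper name and the header values are placeholders. It compares names case-insensitively and reports what the session did not send and what it sent with a different value.

```python
def header_diff(browser_headers, session_headers):
    """Return (missing, different) header names, compared case-insensitively."""
    browser = {k.lower(): v for k, v in browser_headers.items()}
    session = {k.lower(): v for k, v in session_headers.items()}
    missing = sorted(set(browser) - set(session))
    different = sorted(
        k for k in set(browser) & set(session) if browser[k] != session[k]
    )
    return missing, different


# Placeholder values; real ones would be copied from DevTools and from the session.
from_browser = {
    "Origin": "https://idp.example.org",
    "Referer": "https://idp.example.org/idp/profile",
    "User-Agent": "Mozilla/5.0",
}
from_session = {
    "referer": "https://idp.example.org/idp/profile",
    "User-Agent": "python-requests/2.25.1",
}

print(header_diff(from_browser, from_session))
```

For the session side, `dict(r8.request.headers)` should hold the headers requests prepared for that call, although a few headers added at the transport level may not show up there.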
Here is my code:
import re
import requests
## start requests session
s = requests.Session()
## Request 1
url1 = '[ ultimate target URL ]' # This is the ultimate target URL
r1 = s.get(url1)
## Request 2
url2 = r1.url # redirection url from r1 is next up and used as url2
cred = {'j_username': 'XXXXXXXXXX', 'j_password': 'XXXXXXXXXX', '_eventId_proceed' : 'Sign in'}
r2 = s.post(url2, data = cred)
## Request 3
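ss3 = re.search('data-host="',r2.text)
ss4 = re.search('"\n data-sig-request="',r2.text)
ss5 = re.search(':APP',r2.text) # assumption: ss5 was not defined in the code as posted; a Duo sig request has the form 'TX|...:APP|...', so the TX part is taken to end here
data_sig_request = r2.text[ss4.span(0)[1]:ss5.span(0)[0]]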
ss6 = re.search('data-post-action="',r2.text)
ss7 = re.search('"\n frameborder=',r2.text)
data_post_action = r2.text[ss6.span(0)[1]:ss7.span(0)[0]]
prnt = '[ idp url ]' + data_post_action
prnt = prnt.replace('/','%2F').replace(':','%3A').replace(';','%3B').replace('=','%3D').replace('?','%3F')
url3 = 'https://api-XXXXXXXXXX.duosecurity.com/frame/web/v1/auth?tx=' + data_sig_request + '&parent=' + prnt + '&v=2.6'
headers3 = {'Referer': '[ idp url ]'}
s.headers.update(headers3)
r3 = s.post(url3)
## Request 4 - duo prompt
url4 = 'https://api-XXXXXXXXXX.duosecurity.com/frame/prompt'
ss8 = re.search('value="',r3.text)
ss9 = re.search('">\n<input type="hidden" name="url" value="',r3.text)
sid = r3.text[ss8.span(0)[1]:ss9.span(0)[0]]
sid = sid.replace('&#x3d;','=').replace('&#x7c;','|') # decode the HTML-escaped '=' and '|' in the sid
ss10 = re.search('"ukey" value="',r3.text)
ss11 = re.search('">\n\n<input type="hidden" name="out_of_date"',r3.text)
formdata4 = {'sid':sid, 'device':'phone1', 'factor':'Duo Push'}
r4 = s.post(url4,formdata4)
## Request 5 - Duo Status
url5 = 'https://api-XXXXXXXXXX.duosecurity.com/frame/status'
sid5 = sid.replace('=','%3D').replace('|','%7C')
ss12 = re.search('"txid": "',r4.text)
ss13 = re.search('"}}',r4.text)
txid = r4.text[ss12.span(0)[1]:ss13.span(0)[0]]
formdata5 = {'sid':sid, 'txid':txid}
r5 = s.post(url5,formdata5)
## Request 6 - second status call to duo
url6 = url5
formdata6 = formdata5
r6 = s.post(url6,formdata6)
## Request 7 - Duo - evaluate push
url7 = 'https://api-XXXXXXXXXX.duosecurity.com/frame/status/' + txid
formdata7 = {'sid':sid}
r7 = s.post(url7,formdata7)
## Request 8
url8 = r2.url # r2.url is the redirection url from url2
ss18 = re.search('"\n data-post-action',r2.text)
sig_response = r2.text[ss4.span(0)[1]:ss18.span(0)[0]]
sig_response = sig_response.replace('TX|','AUTH|')
formdata8 = {'_eventId':'proceed', 'sig_response':sig_response}
r8 = s.post(url8,formdata8)
## Request 9 - should redirect to ultimate url (url1)
url9 = 'https://XXXXXXXXXXXX/Shibboleth.sso/SAML2/POST'
ss19 = re.search('name="RelayState" value="',r8.text)
ss20 = re.search('"/> \n \n <input type="hidden" name="SAMLResponse" value="',r8.text)
RelayState = r8.text[ss19.span(0)[1]:ss20.span(0)[0]]
RelayState = RelayState.replace('&#x3a;',':') # decode the HTML-escaped ':' in the RelayState
ss21 = re.search('"/> \n </div>\n <noscript>\n <div>\n <input type="submit" value="Continue"/>',r8.text)
SAMLResponse = r8.text[ss20.span(0)[1]:ss21.span(0)[0]]
formdata9 = {'RelayState':RelayState, 'SAMLResponse':SAMLResponse}
r9 = s.post(url9,formdata9,allow_redirects=True)
## Request 10 - this would be the ultimate final destination URL and is url1 from above.
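The span-slicing with `re.search` above depends on the exact whitespace between tags. As a possibly more robust alternative, the hidden form fields could be collected with the standard library's `html.parser`, which also decodes character references in attribute values. This is a minimal sketch: `HiddenInputs` and `hidden_inputs` are hypothetical names, and the sample page is made up rather than copied from the real response to request 8.

```python
from html.parser import HTMLParser


class HiddenInputs(HTMLParser):
    """Collect name/value pairs of every <input type="hidden"> in a page."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "input" and attrs.get("type") == "hidden" and "name" in attrs:
            # Attribute values arrive already unescaped (e.g. &#x3a; becomes ':').
            self.fields[attrs["name"]] = attrs.get("value") or ""


def hidden_inputs(page_text):
    """Return a dict of hidden form fields found in page_text."""
    parser = HiddenInputs()
    parser.feed(page_text)
    parser.close()
    return parser.fields


# Made-up page shaped like an auto-submitting SAML POST form.
sample_page = """<form action="https://sp.example.org/Shibboleth.sso/SAML2/POST" method="post">
<div>
<input type="hidden" name="RelayState" value="ss&#x3a;mem&#x3a;1234"/>
<input type="hidden" name="SAMLResponse" value="PHNhbWxwOlJlc3BvbnNlLz4="/>
</div>
</form>"""

fields = hidden_inputs(sample_page)
print(fields["RelayState"])
```

With this, `formdata9` could be built from `hidden_inputs(r8.text)` without the manual `replace` calls, assuming the real page uses the same field names.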