I'm trying to download files (PDFs) silently from a website with VBA. So far I log in without issue by entering the username and password on the initial screen, navigate to the reports page within the site, and get my list of files successfully in a table. I get the URL of the file in question without issue. Here's where I hit a wall: I do download a file, but when I open it I get a security warning saying I must be logged in to view it. I can reproduce the same warning by pasting the URL into any browser while I'm not logged in. So I'm downloading but not authenticating.

The code for just the download step:

Dim strCookie As String
Dim strResponse As String
Dim xobj As Object
Dim WinHttpReq As Object
Dim WinHttpReq2 As Object
Dim oStream As Object

' Set xobj = New WinHttp.WinHttpRequest
strDocLink = "https://atlasbridge.com" & strDocLink & "&RT=PREVMAIL"
Debug.Print strDocLink
' launch tab & goto url/doc
' try to download the link(this is the url of the file)
' strDocLink
Set WinHttpReq = CreateObject("WINHTTP.WinHTTPRequest.5.1")
strUrl = "https://atlasbridge.com/search/AgencyReports.aspx"
WinHttpReq.Open "GET", strUrl, False
WinHttpReq.Option(6) = False ' 6 = WinHttpRequestOption_EnableRedirects; the named constant only resolves when the WinHTTP type library is referenced
WinHttpReq.setRequestHeader "Referer", "https://atlasbridge.com/search/AgencyReports.aspx"
WinHttpReq.setRequestHeader "User-Agent", "Mozilla/5.0 (Windows NT 10.0; WOW64; Trident/7.0; rv:11.0) like Gecko"
WinHttpReq.setRequestHeader "Connection", "keep-alive"
WinHttpReq.setRequestHeader "Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"
WinHttpReq.setRequestHeader "Accept-Language", "en-US,en;q=0.5"
WinHttpReq.Send
If WinHttpReq.Status = 200 Then
    strResponse = WinHttpReq.responseText
    Debug.Print strResponse
    strCookie = WinHttpReq.getResponseHeader("Set-Cookie") ' returns a single Set-Cookie header, attributes included; the response may carry more than one
    resp = WinHttpReq.getAllResponseHeaders
    ' resp = WinHttpReq.responseBody
    ' strCookie = WinHttpReq.getResponseHeader("Cookie") ' doesnt find the requested header
    Debug.Print strCookie
    Debug.Print resp
End If
' then open second session & try to get document
Set WinHttpReq2 = CreateObject("WINHTTP.WinHTTPRequest.5.1")
WinHttpReq2.Open "GET", strDocLink, False
WinHttpReq2.setRequestHeader "User-Agent", "Mozilla/5.0 (Windows NT 10.0; WOW64; Trident/7.0; rv:11.0) like Gecko"
WinHttpReq2.setRequestHeader "Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"
WinHttpReq2.setRequestHeader "Accept-Language", "en-US,en;q=0.5"
WinHttpReq2.setRequestHeader "Referer", "https://atlasbridge.com/search/AgencyReports.aspx"
WinHttpReq2.setRequestHeader "Connection", "keep-alive"
WinHttpReq2.setRequestHeader "Host", "atlasbridge.com:443"
WinHttpReq2.setRequestHeader "Accept-Encoding", "gzip, deflate, br"
' WinHttpReq2.setRequestHeader "Transfer-Encoding", "chunked"
' doesnt like this one causes error on the .send
WinHttpReq2.setRequestHeader "Cache-Control", "private"
WinHttpReq2.setRequestHeader "Upgrade-Insecure-Requests", "1"
WinHttpReq2.setRequestHeader "Content-Type", "application/pdf"
WinHttpReq2.setRequestHeader "Cookie", strCookie
WinHttpReq2.Send
If WinHttpReq2.Status = 200 Then
    Set oStream = CreateObject("ADODB.Stream")
    oStream.Open
    oStream.Type = 1
    oStream.Write WinHttpReq2.responseBody
    oStream.SaveToFile "C:\Users\MyUserName\Desktop\DownloadedMail\atlasreportdownload.ashx.pdf", 1 ' 1 = no overwrite, 2 = overwrite
    oStream.Close
End If

I've tried a few different things, but I don't believe I'm getting the full cookie & session ID.

The cookie I get back in WinHttpReq.getResponseHeader("Set-Cookie") or getAllResponseHeaders looks like:

NSC_bumbtcsjehf.dpn_TTM_443_MCWT=ffffffffc3a00a0a000000000005e445a4a423660;Version=1;Max-Age=2400;path=/;secure;httponly

But when I use LiveHeaders in Firefox I see:

Cookie: ASP.NET_SessionId=z2e4adilfjgiyynx2mntnh1k; NSC_bumbtcsjehf.dpn_TTM_443_MCWT=ffffffffc3a00a0a000000000005e445a4a423660; AuthToken=0be22946-a97a-442e-bd93-c80f0c96a525; AtlasLastMessage=1173; lc_sso7549731=1546651094987; __lc.visitor_id.7549731=S1546651090.26728e19e6

But I can't seem to expose that full cookie with AuthToken & Session ID, etc. when I Debug.Print the response. Can someone point me in the right direction so I can test a variation on what I'm doing? Thank you in advance.

Update: The response headers from the first request:

 Cache-Control: private
 Date: Wed, 16 Jan 2019 22:04:54 GMT
 Content-Length: 164
 Content-Type: text/html; charset=utf-8
 Location: /default.aspx?err=Expired&dest=%2fhome.aspx
 Server: Microsoft-IIS/7.0
 Set-Cookie: ASP.NET_SessionId=mo0owzztbul5of0litxox5kx; path=/; secure; HttpOnly
 Set-Cookie: NSC_bumbtcsjehf.dpn_TTM_443_MCWT=ffffffffc3a00a1a45525d5f4f58455e445a4a423660;Version=1;Max-Age=2400;path=/;secure;httponly
 X-AspNet-Version: 4.0.30319
 X-UA-Compatible: IE=edge
 X-Powered-By: ASP.NET

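The update above shows two separate Set-Cookie lines, while getResponseHeader("Set-Cookie") hands back only one of them. A minimal sketch of collecting every Set-Cookie line from getAllResponseHeaders and reducing each to its name=value pair follows. The function name BuildCookieHeader is hypothetical and not part of the original code, and the sketch assumes the header text is separated by CR/LF. Note that the response above redirects to the "Expired" page, so cookies gathered from it would still belong to a session that is not logged in.

```vba
' Hypothetical helper (not in the original code): builds a Cookie request
' header value from the text returned by getAllResponseHeaders.
Function BuildCookieHeader(ByVal allHeaders As String) As String
    Dim headerLines() As String
    Dim i As Long
    Dim ln As String
    Dim pair As String
    Dim result As String

    ' Assumption: header lines are separated by CR/LF.
    headerLines = Split(allHeaders, vbCrLf)
    For i = LBound(headerLines) To UBound(headerLines)
        ln = Trim$(headerLines(i))
        ' Header names are case-insensitive, so compare in lower case.
        If LCase$(Left$(ln, 11)) = "set-cookie:" Then
            ' Drop the header name, then keep only the text before the
            ' first ";". The remainder (path, secure, HttpOnly, ...) are
            ' attributes and are not sent back in a Cookie header.
            pair = Trim$(Mid$(ln, 12))
            If InStr(pair, ";") > 0 Then
                pair = Trim$(Left$(pair, InStr(pair, ";") - 1))
            End If
            If Len(pair) > 0 Then
                If Len(result) > 0 Then result = result & "; "
                result = result & pair
            End If
        End If
    Next i

    BuildCookieHeader = result
End Function
```

It could be called as strCookie = BuildCookieHeader(WinHttpReq.getAllResponseHeaders). For the headers in the update this yields the ASP.NET_SessionId pair followed by the NSC_ pair, joined with "; ".
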
I'm working on downloading the response body now.

Jim Carney
  • Quick update. I just manually pasted the full cookie into my code after capturing it in Fiddler, then let my code run, and the file downloaded perfectly. So it tells me most of the code is working, but I need to get the whole cookie with the session ID, etc. That's the crux of the question. – Jim Carney Jan 16 '19 at 20:26
  • Make sure there are no redirections. Take a look at [the example](https://stackoverflow.com/a/49102412/2165759) where `auth_token` can be retrieved from response headers with disabled redirections only. – omegastripes Jan 16 '19 at 20:45
  • Omegastripes, thank you. I added "WinHttpReq.Option(WinHttpRequestOption_EnableRedirects) = False" right below my first GET statement. I see more of the cookie now and can get the session ID, but it's not the complete info. I tried getting the responseBody, but that was mainly gibberish. Should I change where my redirects = false line is processed? – Jim Carney Jan 16 '19 at 21:47
  • Please add to the question the headers you get after disabling redirects. Try to save `.responseBody` and open it as a .zip archive. – omegastripes Jan 16 '19 at 22:00
  • So the response body zip won't open, and the If WinHttpReq.Status = 200 check fails, so it skips the download unless I remove that line. Headers were added to my original post. – Jim Carney Jan 16 '19 at 22:15
  • As you can see, after disabling redirection the response includes a `Location` header that redirects you to the expired error page. I guess you need to repeat the first request with the proper SessionId to fix the error. – omegastripes Jan 17 '19 at 06:48
  • Well, I have solved the cookies. I removed the redirect disable. Both cookies are there but don't print on debug. I save all headers to an array and split them out. So to get the session ID and authentication token I put strDocCookie = IE.Document.cookie at the top of my code after I log in. That gets me everything. Then I just split the parts and rebuild them to look like the one that worked. However, in my quest to get the cookies it seems I've changed something in my code, and it is not downloading successfully even when I feed it the cookie manually, so I'm going through that now to figure out what I changed. – Jim Carney Jan 17 '19 at 22:02
  • Small update. By running Fiddler alongside my code I see two cookies change from the home page to the reports page. I'm capturing one successfully and just need to capture the last one, and I think I'll be there. So it does look like it'll take multiple GET requests to get the last one, but I think this has put me on the right path, thank you! – Jim Carney Jan 18 '19 at 20:12
  • OK, so my question remains: how do I get the right ASP.NET session ID? I'm getting one, but it's the wrong one compared to Fiddler. I've disabled redirects. In your comment about repeating the request with the right session ID, how do I do that, since it's the ASP.NET session ID I'm looking for? Thank you! – Jim Carney Jan 18 '19 at 21:34

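On the Status = 200 check that fails once redirects are disabled: with redirects off, a page that redirects answers with a 3xx status (typically 302) and a Location header rather than 200, which is consistent with the headers posted in the update. A minimal sketch of inspecting that case is below. The Sub name InspectRedirect is hypothetical, the URL is the reports page from the question, and the literal 6 stands in for WinHttpRequestOption_EnableRedirects so the code also works with late binding.

```vba
' Sketch only: shows which status and Location a request returns
' when automatic redirects are turned off.
Sub InspectRedirect()
    Const OPTION_ENABLE_REDIRECTS As Long = 6   ' WinHttpRequestOption_EnableRedirects
    Dim http As Object

    Set http = CreateObject("WinHttp.WinHttpRequest.5.1")
    http.Open "GET", "https://atlasbridge.com/search/AgencyReports.aspx", False
    http.Option(OPTION_ENABLE_REDIRECTS) = False
    http.Send

    Select Case http.Status
        Case 200
            Debug.Print "Page returned directly"
        Case 301, 302, 303, 307, 308
            ' The server wants the client to go elsewhere, e.g. a login
            ' or "expired" page when the session is not authenticated.
            Debug.Print "Redirected to: " & http.getResponseHeader("Location")
        Case Else
            Debug.Print "Unexpected status: " & http.Status & " " & http.StatusText
    End Select

    ' Dump all headers, including every Set-Cookie line.
    Debug.Print http.getAllResponseHeaders
End Sub
```
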