I want to extract the following items from a single website response to an Excel sheet:

  1. Value of one of the cookies.
  2. A value of an ID from the body of the response.

Cookie value to capture:

[Image 1: screenshot of the cookie value to capture]

ID value to capture from HTML body:

[Image 2: screenshot of the ID value in the HTML body]

I have searched for a solution, but I can only find one piece of code that pulls the cookie and a separate piece of code that pulls the ID value from the HTML response body. Combining the two doesn't work, and I need both values to come from the same response because the same cookie value and ID value have to be sent in the subsequent POST request.

To make the flow easier to understand, I will summarise my expectation below:

  1. Visit "Site 1" and grab the "Cookie" value and the unique "ID" value from the response.
  2. Pass the two values received in the previous response to the request of "Site 2".
  3. Grab the link from the response of "Site 2" and visit "Site 3".
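
Step 1 of this flow could be sketched roughly as below: one `WinHttpRequest` is sent, the cookies are read from its response headers, and the ID is read from the body of that same response. This is a minimal sketch under stated assumptions, not a confirmed fix. The helper name `CookiesFromHeaders`, the URL, the element id `RequestVerificationToken`, and the output cells are placeholders for illustration. No `Accept-Encoding` header is sent, on the assumption that the body then arrives as plain, uncompressed text.

```vba
Option Explicit

' Hypothetical helper: collects "name=value" from every Set-Cookie line
' in the raw response headers and joins them with "; ".
Function CookiesFromHeaders(ByVal rawHeaders As String) As String
    Dim headerLine As Variant
    Dim pair As String
    Dim result As String

    For Each headerLine In Split(rawHeaders, vbCrLf)
        ' "set-cookie:" is 11 characters long
        If LCase$(Left$(headerLine, 11)) = "set-cookie:" Then
            ' Keep only the text before the first ";" (drops path, HttpOnly, ...).
            ' The appended ";" guarantees Split returns at least one element.
            pair = Trim$(Split(Mid$(headerLine, 12) & ";", ";")(0))
            If Len(pair) > 0 Then
                If Len(result) > 0 Then result = result & "; "
                result = result & pair
            End If
        End If
    Next headerLine

    CookiesFromHeaders = result
End Function

Sub CookieAndIdFromOneResponse()
    Dim http As Object
    Dim doc As Object
    Dim tokenInput As Object
    Dim cookieHeader As String
    Dim token As String

    Set http = CreateObject("WinHttp.WinHttpRequest.5.1")
    http.Open "GET", "https://example.com", False   ' placeholder URL
    http.send

    ' Cookie value(s) from the headers of this response
    cookieHeader = CookiesFromHeaders(http.getAllResponseHeaders)

    ' ID value from the body of the same response
    Set doc = CreateObject("htmlfile")
    doc.body.innerHTML = http.responseText
    Set tokenInput = doc.getElementById("RequestVerificationToken")   ' assumed id
    If Not tokenInput Is Nothing Then token = tokenInput.getAttribute("value")

    ' Placeholder output cells on the active sheet
    Range("C2").Value = cookieHeader
    Range("C3").Value = token
End Sub
```

Leaving out `On Error Resume Next` is deliberate here: any failing line then raises a visible error instead of silently producing an empty value.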

Below is the code I have used to retrieve the cookie values and the HTML body content. It throws an error when I uncomment the lines that pull the HTML body content. Kindly let me know where I am making a mistake, or suggest a different approach. (I have tried several variations, so I have kept them as comments.)

    Sub Cookie_and_HTMLbody()
    
    Dim strCookie As Variant
    Dim strToken As Variant
    Dim Doc As Object
    Dim pontod As Object
    'Dim Elements As IHTMLElementCollection
    'Dim Element As IHTMLElement

    On Error Resume Next
    Set Doc = New HTMLDocument

    With CreateObject("WinHttp.WinHttpRequest.5.1")
    'With CreateObject("MSXML2.XMLHTTP")
        .Open "GET", "https://example.com", False
        .setRequestHeader "Upgrade-Insecure-Requests", "1"
        .setRequestHeader "User-Agent", "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.85 Safari/537.36"
        .setRequestHeader "Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9"
        .setRequestHeader "Sec-Fetch-Site", "none"
        .setRequestHeader "Sec-Fetch-Mode", "navigate"
        .setRequestHeader "Sec-Fetch-User", "?1"
        .setRequestHeader "Sec-Fetch-Dest", "document"
        .setRequestHeader "Accept-Encoding", "gzip, deflate"
        .setRequestHeader "Accept-Language", "en-US,en;q=0.9"
        .setRequestHeader "Connection", "close"
        .send
               
        Doc.body.innerHTML = .responseText
        Set pontod = Doc.getElementById("trialrequestlanding").getElementsByTagName("div")(1).getElementsByTagName("div")(1).getElementsByTagName("div")(1).getElementsByTagName("div")(1).getElementsByTagName("form")(1).getElementsByTagName("div")(1).getElementsByTagName("input")(1)
        strCookie = .getAllResponseHeaders
        'strCookie = .getResponseHeader("Set-Cookie:")
        'strCookie = Split(strCookie, "Set-Cookie:")
        'strCookie = Trim(strCookie(UBound(strCookie)))
        strCookie = Split(strCookie, vbCrLf)
        strCookie = Trim(Split(Split(strCookie(5), ";")(0), ":")(1)) & "; " & Trim(Split(Split(strCookie(6), ";")(0), ":")(1))
        MsgBox strCookie
        '.responseType = document
        'Doc = .responseText
        strToken = pontod.getAttribute("value")
        'strToken = Doc.querySelector("input[name='RequestVerificationToken']").getAttribute("value")
        'strToken = document.getElementsByTagName("input")
        'Set Doc = ie.document
        MsgBox strToken

        'Set Elements = .getElementsByTagName("input")
        'For Each Element In Elements
        '    If Element.ID = "RequestVerificationToken" Then
        '        Range("c2").Value = Element.innerText
        '        MsgBox Element.Value
        '    End If
        'Next Element

        'Set Elements = Nothing
        'Doc.Quit
        'Set Doc = Nothing
    End With
    End Sub

Another piece of code, which does work for retrieving a value from the HTML body, is given below.

    Sub Generate_Email()
    Dim Shell As Object
    Dim i As Variant
    Dim bie As Object
    Dim ie As Object
    Dim Doc As HTMLDocument
    Dim Elements As IHTMLElementCollection
    Dim Element As IHTMLElement

    'Set ie = New InternetExplorerMedium
    Set ie = CreateObject("InternetExplorer.Application")
    'Set ie = GetObject("new:{D5E8041D-920F-45e9-B8FB-B1DEB82C6E5E}")
    'Set ie = New InternetExplorer

    ie.Visible = False

    ie.navigate "https://randomsite.com/"

    Do
        DoEvents
    Loop Until ie.readyState = 4
    'Do While ie.Busy Or ie.readyState <> 4
    'DoEvents
    'Loop

    Set Doc = ie.document

    Set Elements = Doc.getElementsByTagName("span")

    For Each Element In Elements
        If Element.ID = "email_ch_text" Then
            Range("c2").Value = Element.innerText
        End If
    Next Element

    Set Elements = Nothing

    ie.Visible = True

    ie.Quit
    Set ie = Nothing

    Set objWMIService = GetObject("winmgmts:\\.\root\cimv2")

    Set colItems = objWMIService.ExecQuery("Select * From Win32_Process")

    On Error Resume Next

    For Each objItem In colItems
        'msgbox objItem.name & " " & objItem.ProcessID & " " & objItem.CommandLine
        If objItem.Name = "ielowutil.exe" Then objItem.Terminate
    Next

    For Each objItem In colItems
        'msgbox objItem.name & " " & objItem.ProcessID & " " & objItem.CommandLine
        If objItem.Name = "iexplore.exe" Then objItem.Terminate
    Next

    End Sub

How can I retrieve both values with a single piece of code?

UPDATE (02 May 2021):

I have rewritten the code. It now extracts the cookie properly, but it fails to pull the "value" attribute of the element shown in image 2. Kindly help me identify the mistake that prevents the code below from extracting that attribute.

    Sub Test_Cookie_and_HTML()
        
    Dim pontod As Object
   
    Dim html As Object
        
    On Error Resume Next
    Set html = New HTMLDocument
    With CreateObject("WinHttp.WinHttpRequest.5.1")
        .Open "GET", "https://portswigger.net/burp/pro/trial", False
        .setRequestHeader "Upgrade-Insecure-Requests", "1"
        .setRequestHeader "User-Agent", "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.85 Safari/537.36"
        .setRequestHeader "Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9"
        .setRequestHeader "Sec-Fetch-Site", "none"
        .setRequestHeader "Sec-Fetch-Mode", "navigate"
        .setRequestHeader "Sec-Fetch-User", "?1"
        .setRequestHeader "Sec-Fetch-Dest", "document"
        .setRequestHeader "Accept-Encoding", "gzip, deflate"
        .setRequestHeader "Accept-Language", "en-US,en;q=0.9"
        .setRequestHeader "Connection", "close"
        .send
        html.body.innerHTML = .responseText
        
               
        Dim strCookie As String
        Dim sessionidCookie As String

        strCookie = .getResponseHeader("Set-Cookie")     ' --> "SESSIONID=40DD2DFCAF24A2D64544F55194FCE04E;path=/pamsservices;HttpOnly"
        sessionidCookie = GetsessionIdCookie(strCookie)       ' Strips to  "SESSIONID=40DD2DFCAF24A2D64544F55194FCE04E"

        MsgBox sessionidCookie
        MsgBox RequestVerificationToken
                
    End With
    
    Set pontod = html.getElementById("trialrequestlanding").getElementsByTagName("input")(1)
    MsgBox pontod.getAttribute("value")
    End Sub
Yuthan
  • In the case of the first code have you confirmed whether the id is present in the response? It may be that this can only be done with multiple requests. – QHarr May 01 '21 at 15:23
  • You can remove a loop by using `Range("c2").Value = ie.document.querySelector("span#email_ch_text").innerText`. I would remove code that isn't required as it is noise (I am referring to all the commented-out lines in particular). If that is part of an attempt to combine, then put that as a 3rd code block, without commenting out lines used in your attempt. For the non-working combined attempt, please explain what isn't working, detailing any error messages and where they occur. – QHarr May 01 '21 at 15:27
  • @QHarr Yeah, it is in the response body. No, multiple requests do not help because each cookie value is tied up with its id value in the response. If we provide a different ID value for a different cookie, then the subsequent request will become invalid. So, I need to find a way to retrieve both the cookie and the ID from the same response. – Yuthan May 01 '21 at 15:50
  • @QHarr removing the loop is a good idea; sure, I will try that out. Thank you for the suggestion. However, I still need to work out code that serves both purposes. – Yuthan May 01 '21 at 15:51
  • Your commented out querySelector for the id in first code looks good. What didn't work with that? – QHarr May 01 '21 at 16:18
  • You can't extract the cookie from the second method? https://stackoverflow.com/questions/9690947/how-can-i-access-cookie-from-internetexplorer-application or maybe look at this: https://stackoverflow.com/questions/38726408/retrieve-all-cookies-from-internet-explorer – Tim Williams May 01 '21 at 19:22
  • @QHarr the querySelector returns a blank result. – Yuthan May 02 '21 at 13:24
  • @TimWilliams nope, it does not help as the document.cookie provides the cookie from request and not from the response. – Yuthan May 02 '21 at 13:25
  • That's not my understanding of what `document.cookie` does, but sounds like you've already determined it doesn't meet your needs. – Tim Williams May 02 '21 at 23:43

0 Answers