2

I was wondering if it is possible to "click a button" on a webpage without opening the page in IE. The webpage is dynamically generated and the click on the button calls a script that changes the content of the page.

I am able to do this opening Internet Explorer with this sub:

Sub DownloadPageScript(strUrl As String, htmlPage As htmlDocument, strScript As String)

  Dim IE            As Object

  Set IE = CreateObject("InternetExplorer.application")
  IE.navigate strUrl

  Do
      DoEvents
  Loop Until IE.ReadyState = READYSTATE_COMPLETE 

  ' Run the scripts associated to the button to get the data
  IE.Document.parentWindow.execScript strScript, "jscript"

  Do
      DoEvents
  Loop Until IE.ReadyState = READYSTATE_COMPLETE

  Set htmlPage = IE.Document

End Sub

But I would like to avoid opening Internet Explorer so I would like to something like this:

Sub Download_Page(strUrl As String, htmlPage As htmlDocument, strScript As String)

  Dim xmlHttp      As Object
  '
  Set xmlHttp = CreateObject("MSXML2.XMLHTTP")
  xmlHttp.Open "GET", strUrl, False
  xmlHttp.setRequestHeader "Content-Type", "text/xml"
  xmlHttp.send
  '
  ' Here I should add something to execute the script
  ' 
  ' After execution
  '
  Set htmlPage = New htmlDocument
  htmlPage.body.innerHTML = xmlHttp.ResponseText
  '
End Sub

I was expecting to find something like a xmlHttp.execute(args) method to replicate the action of clicking the button but I was wrong. So my question is: Is it possible to replicate the button click if I do not want to open Internet Explorer? and if yes what should I do?

New approach based on the idea in the comments

I tried the approach suggested by @omegastripes in the comments and I wrote this sub taken by his answer 33484763:

Sub TestDownload()

  Dim xmlHttp      As Object
  Dim htmlPage       As htmlDocument
  Dim strExportURL   As String
  Dim strFormData    As Variant
  Dim strContent     As String


    ' build exportURL parameter
  strExportURL = Join(Array( _
      "p_p_id=ScommesseAntepostPalinsesto_WAR_scommesseportle", _
      "p_p_lifecycle=2", _
      "p_p_resource_id=dettagliManifestazione", _
      "p_p_cacheability=cacheLevelPage", _
      "_ScommesseAntepostPalinsesto_WAR_scommesseportlet_codDisc=1", _
      "_ScommesseAntepostPalinsesto_WAR_scommesseportlet_codMan=21", _
      "_ScommesseAntepostPalinsesto_WAR_scommesseportlet_codScomm=3", _
      "_ScommesseAntepostPalinsesto_WAR_scommesseportlet_codClusterScomm=80", _
      "_ScommesseAntepostPalinsesto_WAR_scommesseportlet_filtro=0" _
      ), "&")

  ' build the whole form data
  strFormData = Join(Array( _
        "languageCode=en", _
        "exportURL=" & URLEncode(strExportURL) _
    ), "&")

  ' POST XHR to retrieve the content
  Set xmlHttp = CreateObject("Microsoft.XMLHTTP")
  xmlHttp.Open "POST", "http://www.sisal.it/scommesse-matchpoint/palinsesto", False
  xmlHttp.setRequestHeader "Content-Type", "application/json"
  xmlHttp.send strFormData

  Set htmlPage = New htmlDocument
  htmlPage.body.innerHTML = xmlHttp.responseText
  '
End Sub

Public Function URLEncode( _
   StringVal As String, _
   Optional SpaceAsPlus As Boolean = False _
) As String
  Dim bytes() As Byte, b As Byte, i As Integer, space As String

  If SpaceAsPlus Then space = "+" Else space = "%20"

  If Len(StringVal) > 0 Then
    With New ADODB.Stream
      .Mode = adModeReadWrite
      .Type = adTypeText
      .Charset = "UTF-8"
      .Open
      .WriteText StringVal
      .Position = 0
      .Type = adTypeBinary
      .Position = 3 ' skip BOM
      bytes = .Read
    End With

    ReDim result(UBound(bytes)) As String

    For i = UBound(bytes) To 0 Step -1
      b = bytes(i)
      Select Case b
        Case 97 To 122, 65 To 90, 48 To 57, 45, 46, 95, 126
          result(i) = Chr(b)
        Case 32
          result(i) = space
        Case 0 To 15
          result(i) = "%0" & Hex(b)
        Case Else
          result(i) = "%" & Hex(b)
      End Select
    Next i

    URLEncode = Join(result, "")
  End If
End Function

The URLEncode() function is from this post URLEncode. (I tried to use JScriptControl but It does not work probably because I have Office 64-bit).

This code runs without errors but when I look to the content of htmlPage it is almost empty. I think the problem is that the request I am sending is wrong but I am not able to correct it, can you help me?

Community
  • 1
  • 1
MeSS83
  • 349
  • 2
  • 7
  • 20
  • xmlhttp doesn't render the page - it just has the response from the server for that particular URL (and doesn't include any linked resources such as scripts, css, images, etc). So there is no button to click (or script to execute). You need to load the page into a host object which can implement the full functionality - I don't know if `htmlDocument` will do that: that's one thing to try... – Tim Williams Nov 04 '15 at 18:49
  • thanks @TimWilliams for this answer, at least I know that what I was trying to do can't work. I am going to try your idea and I will let you now – MeSS83 Nov 04 '15 at 19:01
  • @MeSS83 could you please provide the actual link to the webpage and describe the changes of the content after button clicked? – omegastripes Nov 04 '15 at 20:03
  • Hi @TimWilliams I tried your suggestion. The `htmlDocument` support the script execution so I tried to add the command `htmlPage.parentWindow.execScript strScript, "jscript"` at the end of the subroutine but I am getting a runtime error – MeSS83 Nov 04 '15 at 20:07
  • @omegastripes the effect after button clicked is to update the table in the page with the required data. The link is this one [link](https://www.sisal.it/scommesse-matchpoint/). I did not put in the question because this site is only accessible from Italy (at least I believe) – MeSS83 Nov 04 '15 at 20:22
  • Is the page in your browser looks like at [this screenshot](http://i.stack.imgur.com/HeXUb.jpg)? And what is the button you click? – omegastripes Nov 04 '15 at 20:37
  • @omegastripes yes it is the same page, I click the option "ITA Serie A" under "Calcio" (the first choice on the right under the search tab). Normally "Calcio" is open by default for me but it is not important. The script associated to the "ITA Serie A" option is `getAlberaturaAntepostManager().clickManifestazione(1, 21)` – MeSS83 Nov 04 '15 at 20:49
  • Could you elaborate what your final goal is? I suppose there may be another approach to achieve, excluding browser click "emulation". It depends on what data are you intent to retrieve from the generated table, and target categories (calcio, basket ..). For example, after click "ITA Serie A" I see XHR POST request to https://www.sisal.it/scommesse-matchpoint/palinsesto and [JSON response](http://pastebin.com/NzY8iUP8), which includes the most of the data presented in the [generated table](http://i.stack.imgur.com/XFKGf.png). – omegastripes Nov 04 '15 at 21:29
  • @omegastripes the goal is to retrieve the data from the generated table (the one you have) and insert it in a spreadsheet. In particular I would like to retrieve the data for the 1 X 2 columns. I do not know if this answer you question – MeSS83 Nov 04 '15 at 21:42
  • I would try to get that data via XHR POST: parse JSON and retrieve values to the sheet (that isn't hard, see [link](http://stackoverflow.com/a/33484763/2165759), [link](http://stackoverflow.com/a/32429348/2165759)). It's necessary to find out how to get appropriate XHR parameters and headers. That may be done inspecting the web page XHRs after clicking with e. g. Chrome developer tools Network tab. – omegastripes Nov 04 '15 at 22:15
  • thanks @omegastripes tomorrow I will try and let you know – MeSS83 Nov 04 '15 at 22:20
  • What do you need is headless browser that can be controlled via vba – Aminadav Glickshtein Nov 05 '15 at 11:16
  • @MeSS83, `htmlDocument` not need, parse JSON with `Sub ParseJson()` by [the link](http://stackoverflow.com/a/30494373/2165759). First of all, examine XHR made by browser to find out the appropriate parameters you have to send. That could be done in Chrome: open your page, press F12 to open developer tools window, click `ITA Serie A`, open Network tab, click XHR marked (1) at [my screenshot](http://i.stack.imgur.com/lPFaI.png), examine parameters (2) and header (3). JSON response the data should be retrieved from is on Response tab (use http://jsbeautifier.org/ to make it readable). – omegastripes Nov 05 '15 at 17:27
  • thanks @omegastripes for your help, the steps (1)(2)(3) of your screenshot are what I have done (or at least I believe :) ) in the code posted in the updated question. I am not sure if I am missing something in `strFormData` because i do not full understand how it is created in your [link](http://stackoverflow.com/questions/33462100/how-can-i-automate-save-as-dialog-box-in-ie11-using-vba/33484763#33484763). Unfortunately I cannot use your `Sub ParseJson() ` because `ScriptControl` is not working (i think because i use 64-bit Office that is why i tried to use `htmlDOcument` – MeSS83 Nov 05 '15 at 17:55
  • `strFormData` is used there to send Form data, but for this case I suppose there is no Form data send, all necessary parameters are being sent withing `Request URL` (2). So, target URL and all parameters you have prepared should be joined together (see browser's XHR as example), then make POST XHR passing that string as URL (like in GET XHR). Concerning JSON parsing, read that post carefully, especially update - `Sub ParseJson()` uses RegEx's but not ScriptControl. – omegastripes Nov 05 '15 at 18:12

1 Answers1

2

Consider the below example:

Option Explicit

Sub TestDownload()

    Dim strParams As String
    Dim strURL As String
    Dim strJsonString As String
    Dim varJson As Variant
    Dim strState As String
    Dim arrScommessaList() As Variant
    Dim varScommessa As Variant

    strParams = Join(Array( _
        "p_p_id=ScommesseAntepostPalinsesto_WAR_scommesseportlet", _
        "p_p_lifecycle=2", _
        "p_p_state=normal", _
        "p_p_resource_id=dettagliManifestazione", _
        "p_p_cacheability=cacheLevelPage", _
        "_ScommesseAntepostPalinsesto_WAR_scommesseportlet_codDisc=1", _
        "_ScommesseAntepostPalinsesto_WAR_scommesseportlet_codMan=21", _
        "_ScommesseAntepostPalinsesto_WAR_scommesseportlet_codClusterScomm=", _
        "_ScommesseAntepostPalinsesto_WAR_scommesseportlet_filtro=0" _
    ), "&")
    strURL = "http://www.sisal.it/scommesse-matchpoint/palinsesto?" & strParams

    With CreateObject("Microsoft.XMLHTTP")
        .Open "GET", strURL, False
        .Send
        strJsonString = .ResponseText
    End With

    ParseJson strJsonString, varJson, strState

    arrScommessaList = varJson("scommessaList")
    For Each varScommessa In arrScommessaList
        Debug.Print varScommessa("descrizioneAvvenimento")
        Debug.Print vbTab & _
        varScommessa("esitoList")(0)("formattedQuota") & vbTab & _
        varScommessa("esitoList")(1)("formattedQuota") & vbTab & _
        varScommessa("esitoList")(2)("formattedQuota")
    Next

End Sub

Sub ParseJson(ByVal strContent As String, varJson As Variant, strState As String)
    ' strContent - source JSON string
    ' varJson - created object or array to be returned as result
    ' strState - Object|Array|Error depending on processing to be returned as state
    Dim objTokens As Object
    Dim lngTokenId As Long
    Dim objRegEx As Object
    Dim bMatched As Boolean

    Set objTokens = CreateObject("Scripting.Dictionary")
    lngTokenId = 0
    Set objRegEx = CreateObject("VBScript.RegExp")
    With objRegEx
        ' specification http://www.json.org/
        .Global = True
        .MultiLine = True
        .IgnoreCase = True
        .Pattern = """(?:\\""|[^""])*""(?=\s*(?:,|\:|\]|\}))"
        Tokenize objTokens, objRegEx, strContent, lngTokenId, bMatched, "str"
        .Pattern = "(?:[+-])?(?:\d+\.\d*|\.\d+|\d+)e(?:[+-])?\d+(?=\s*(?:,|\]|\}))"
        Tokenize objTokens, objRegEx, strContent, lngTokenId, bMatched, "num"
        .Pattern = "(?:[+-])?(?:\d+\.\d*|\.\d+|\d+)(?=\s*(?:,|\]|\}))"
        Tokenize objTokens, objRegEx, strContent, lngTokenId, bMatched, "num"
        .Pattern = "\b(?:true|false|null)(?=\s*(?:,|\]|\}))"
        Tokenize objTokens, objRegEx, strContent, lngTokenId, bMatched, "cst"
        .Pattern = "\b[A-Za-z_]\w*(?=\s*\:)" ' unspecified name without quotes
        Tokenize objTokens, objRegEx, strContent, lngTokenId, bMatched, "nam"
        .Pattern = "\s"
        strContent = .Replace(strContent, "")
        .MultiLine = False
        Do
            bMatched = False
            .Pattern = "<\d+(?:str|nam)>\:<\d+(?:str|num|obj|arr|cst)>"
            Tokenize objTokens, objRegEx, strContent, lngTokenId, bMatched, "prp"
            .Pattern = "\{(?:<\d+prp>(?:,<\d+prp>)*)?\}"
            Tokenize objTokens, objRegEx, strContent, lngTokenId, bMatched, "obj"
            .Pattern = "\[(?:<\d+(?:str|num|obj|arr|cst)>(?:,<\d+(?:str|num|obj|arr|cst)>)*)?\]"
            Tokenize objTokens, objRegEx, strContent, lngTokenId, bMatched, "arr"
        Loop While bMatched
        .Pattern = "^<\d+(?:obj|arr)>$" ' unspecified top level array
        If Not (.test(strContent) And objTokens.Exists(strContent)) Then
            varJson = Null
            strState = "Error"
        Else
            Retrieve objTokens, objRegEx, strContent, varJson
            strState = IIf(IsObject(varJson), "Object", "Array")
        End If
    End With
End Sub

Sub Tokenize(objTokens, objRegEx, strContent, lngTokenId, bMatched, strType)
    Dim strKey As String
    Dim strRes As String
    Dim lngCopyIndex As Long
    Dim objMatch As Object

    strRes = ""
    lngCopyIndex = 1
    With objRegEx
        For Each objMatch In .Execute(strContent)
            strKey = "<" & lngTokenId & strType & ">"
            bMatched = True
            With objMatch
                objTokens(strKey) = .Value
                strRes = strRes & Mid(strContent, lngCopyIndex, .FirstIndex - lngCopyIndex + 1) & strKey
                lngCopyIndex = .FirstIndex + .Length + 1
            End With
            lngTokenId = lngTokenId + 1
        Next
        strContent = strRes & Mid(strContent, lngCopyIndex, Len(strContent) - lngCopyIndex + 1)
    End With
End Sub

Sub Retrieve(objTokens, objRegEx, strTokenKey, varTransfer)
    Dim strContent As String
    Dim strType As String
    Dim objMatches As Object
    Dim objMatch As Object
    Dim strName As String
    Dim varValue As Variant
    Dim objArrayElts As Object

    strType = Left(Right(strTokenKey, 4), 3)
    strContent = objTokens(strTokenKey)
    With objRegEx
        .Global = True
        Select Case strType
            Case "obj"
                .Pattern = "<\d+\w{3}>"
                Set objMatches = .Execute(strContent)
                Set varTransfer = CreateObject("Scripting.Dictionary")
                For Each objMatch In objMatches
                    Retrieve objTokens, objRegEx, objMatch.Value, varTransfer
                Next
            Case "prp"
                .Pattern = "<\d+\w{3}>"
                Set objMatches = .Execute(strContent)

                Retrieve objTokens, objRegEx, objMatches(0).Value, strName
                Retrieve objTokens, objRegEx, objMatches(1).Value, varValue
                If IsObject(varValue) Then
                    Set varTransfer(strName) = varValue
                Else
                    varTransfer(strName) = varValue
                End If
            Case "arr"
                .Pattern = "<\d+\w{3}>"
                Set objMatches = .Execute(strContent)
                Set objArrayElts = CreateObject("Scripting.Dictionary")
                For Each objMatch In objMatches
                    Retrieve objTokens, objRegEx, objMatch.Value, varValue
                    If IsObject(varValue) Then
                        Set objArrayElts(objArrayElts.Count) = varValue
                    Else
                        objArrayElts(objArrayElts.Count) = varValue
                    End If
                    varTransfer = objArrayElts.Items
                Next
            Case "nam"
                varTransfer = strContent
            Case "str"
                varTransfer = Mid(strContent, 2, Len(strContent) - 2)
                varTransfer = Replace(varTransfer, "\""", """")
                varTransfer = Replace(varTransfer, "\\", "\")
                varTransfer = Replace(varTransfer, "\/", "/")
                varTransfer = Replace(varTransfer, "\b", Chr(8))
                varTransfer = Replace(varTransfer, "\f", Chr(12))
                varTransfer = Replace(varTransfer, "\n", vbLf)
                varTransfer = Replace(varTransfer, "\r", vbCr)
                varTransfer = Replace(varTransfer, "\t", vbTab)
                .Global = False
                .Pattern = "\\u[0-9a-fA-F]{4}"
                Do While .test(varTransfer)
                    varTransfer = .Replace(varTransfer, ChrW(("&H" & Right(.Execute(varTransfer)(0).Value, 4)) * 1))
                Loop
            Case "num"
                varTransfer = Evaluate(strContent)
            Case "cst"
                Select Case LCase(strContent)
                    Case "true"
                        varTransfer = True
                    Case "false"
                        varTransfer = False
                    Case "null"
                        varTransfer = Null
                End Select
        End Select
    End With
End Sub

The output is:

output

For actual table on the page:

table

Hope this helps.

omegastripes
  • 12,351
  • 4
  • 45
  • 96