
I'm trying to create a script in VBA using ServerXMLHTTP60 to parse the title of the first post from several similar links. My main goal is to make the request asynchronous and to set a maximum time the script will wait before moving on to the next URL.

However, the macro I've created always hits the timeout and moves on to the next URL without being able to scrape the titles from the links.

Sub FetchContentWithinSpecificTime()
    Dim oHttp As New ServerXMLHTTP60, HTML As New HTMLDocument
    Dim URL As Variant, Urllist As Variant, t As Date, sResp As Boolean

    Urllist = Array( _
        "https://stackoverflow.com/questions/tagged/web-scraping?tab=newest&page=1", _
        "https://stackoverflow.com/questions/tagged/web-scraping?tab=newest&page=2", _
        "https://stackoverflow.com/questions/tagged/web-scraping?tab=newest&page=3", _
        "https://stackoverflow.com/questions/tagged/web-scraping?tab=newest&page=4", _
        "https://stackoverflow.com/questions/tagged/web-scraping?tab=newest&page=5" _
    )

    For Each URL In Urllist
        Debug.Print "trying with: " & URL
        With oHttp
            .Open "GET", URL, True
            .setRequestHeader "User-Agent", "Mozilla/5.0"
            .setTimeouts 5000, 5000, 15000, 15000
            .send
            t = Now + TimeValue("00:00:10")          'deadline for this URL
            sResp = False

            On Error Resume Next
            'poll until the response is complete or the deadline passes
            Do
                If .readyState = 4 Then sResp = True: Exit Do
                If Now > t Then sResp = False: Exit Do
                DoEvents
            Loop
            On Error GoTo 0

            If sResp Then
                HTML.body.innerHTML = .responseText
                Debug.Print HTML.querySelector(".question-hyperlink").innerText
            Else
                Debug.Print "failed with: " & URL
            End If
        End With
    Next URL
End Sub

How can I make the script wait for a certain amount of time for the titles to be parsed before moving on to the next URL?

MITHU
  • Hmm, would you be OK sending all the requests up front and then parsing as the results are returned? IIRC, this is how async works; it doesn't wait for a response from the server before sending the next request. If so, maybe something I did a while back https://codereview.stackexchange.com/a/221316/108307 would help. You could add a max timeout in the while loop in the example. – Ryan Wildry Apr 20 '20 at 11:10
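
For reference, a minimal sketch of the fire-everything-first approach that comment describes (the Sub name, the shortened URL list, and the overall 15-second deadline are arbitrary choices, not code from the question or the linked review): one ServerXMLHTTP60 object per URL, every request sent before any response is awaited, then a single polling loop that parses each response as soon as it completes.

Sub FireAllThenParse()
    Dim reqs() As ServerXMLHTTP60, done() As Boolean, HTML As New HTMLDocument
    Dim Urllist As Variant, i As Long, t As Date, allDone As Boolean

    Urllist = Array( _
        "https://stackoverflow.com/questions/tagged/web-scraping?tab=newest&page=1", _
        "https://stackoverflow.com/questions/tagged/web-scraping?tab=newest&page=2" _
    )
    ReDim reqs(LBound(Urllist) To UBound(Urllist))
    ReDim done(LBound(Urllist) To UBound(Urllist))

    'send every request up front; nothing waits for a response yet
    For i = LBound(Urllist) To UBound(Urllist)
        Set reqs(i) = New ServerXMLHTTP60
        reqs(i).Open "GET", Urllist(i), True
        reqs(i).setRequestHeader "User-Agent", "Mozilla/5.0"
        reqs(i).send
    Next i

    'poll all requests, parsing each one as soon as it completes,
    'until everything is done or the overall deadline passes
    t = Now + TimeValue("00:00:15")
    Do
        allDone = True
        For i = LBound(reqs) To UBound(reqs)
            If Not done(i) Then
                If reqs(i).readyState = 4 Then
                    HTML.body.innerHTML = reqs(i).responseText
                    Debug.Print HTML.querySelector(".question-hyperlink").innerText
                    done(i) = True
                Else
                    allDone = False
                End If
            End If
        Next i
        If allDone Or Now > t Then Exit Do
        DoEvents
    Loop
End Sub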

1 Answer


I don't know why those SO links take so long to respond, but I tried with different URLs and the following solution appears to work correctly. The credit for the rectified portion goes to the provider of this solution.

Sub FetchContentWithinSpecificTime()
    Dim oHttp As New ServerXMLHTTP60, HTML As New HTMLDocument
    Dim URL As Variant, Urllist As Variant, t As Date
    Dim sPrice$, sResp As Boolean

    Urllist = Array( _
        "https://finance.yahoo.com/quote/NZDUSD=X?p=NZDUSD=X", _
        "https://finance.yahoo.com/quote/FB?p=FB", _
        "https://finance.yahoo.com/quote/AAPL?p=AAPL", _
        "https://finance.yahoo.com/quote/IBM?p=IBM", _
        "https://finance.yahoo.com/quote/UCO?p=UCO" _
    )

    For Each URL In Urllist
        Debug.Print "trying with: " & URL
        With oHttp
            .Open "GET", URL, True
            .setRequestHeader "User-Agent", "Mozilla/5.0"
            .send
            t = Now + TimeValue("00:00:10")          'deadline for this URL
            sResp = False                            'becomes True only on failure

            'poll until the response is complete or the deadline passes
            Do While .readyState < 4
                If .readyState = 4 Then Exit Do
                sResp = (Now > t) Or (Err.Number <> 0)
                If sResp Then Exit Do
                DoEvents
            Loop

            If Not sResp Then
                HTML.body.innerHTML = .responseText
                sPrice = HTML.querySelector(".Mb\(-4px\)").innerText
                Debug.Print sPrice
            Else
                Debug.Print "failed with: " & URL
            End If
        End With
    Next URL
End Sub
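
As a side note, if I'm not mistaken the same wait can be written without a manual polling loop: ServerXMLHTTP60 also exposes a waitForResponse method, which blocks an asynchronous request for at most the given number of seconds and returns False on a timeout. A minimal sketch (the 10-second wait and the abort call after a failure are my own choices, not part of the code above); unlike the DoEvents loop, this keeps Excel busy while it waits:

Sub FetchWithWaitForResponse()
    Dim oHttp As New ServerXMLHTTP60, HTML As New HTMLDocument
    Dim URL As Variant

    For Each URL In Array( _
        "https://finance.yahoo.com/quote/AAPL?p=AAPL", _
        "https://finance.yahoo.com/quote/IBM?p=IBM")

        With oHttp
            .Open "GET", URL, True
            .setRequestHeader "User-Agent", "Mozilla/5.0"
            .send
            If .waitForResponse(10) Then        'wait up to 10 seconds for this URL
                HTML.body.innerHTML = .responseText
                Debug.Print HTML.querySelector(".Mb\(-4px\)").innerText
            Else
                Debug.Print "failed with: " & URL
                .abort                          'cancel the stalled request before reusing the object
            End If
        End With
    Next URL
End Sub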
MITHU