I've written a VBA script that uses ServerXMLHTTP requests so that I can route them through proxies and set timeouts. When I run the script it appears to work, but it gets stuck after trying the first proxy. I want it to keep running until there are no proxies left to try. I added the line While .readyState < 4: DoEvents: Wend only to keep the script from freezing. Whether or not a proxy works, the script should move on to the next one, right?

This is what I've tried:

Sub MakeProxiedRequests()
    Dim Http As New ServerXMLHTTP60, Html As New HTMLDocument
    Dim elem As Object, proxyList As Variant, oProxy As Variant

    proxyList = Array( _
        "191.96.42.184:3129", _
        "138.197.108.5:3128", _
        "35.245.145.147:8080", _
        "173.46.67.172:58517", _
        "191.96.42.82:3129", _
        "157.55.201.224:8080", _
        "67.205.172.239:3128", _
        "191.96.42.106:3129" _
    )

    For Each oProxy In proxyList

        Debug.Print "trying with: " & oProxy

        With Http
            .Open "GET", "https://stackoverflow.com/questions/tagged/web-scraping", True
            .setRequestHeader "User-Agent", "Mozilla/5.0"
            .setProxy 2, oProxy
            .setTimeouts 600000, 600000, 15000, 15000 'I don't know the ideal timeout parameters
            On Error Resume Next
            .send
            While .readyState < 4: DoEvents: Wend 'to keep the script from freezing

            Html.body.innerHTML = .responseText
            Set elem = Html.querySelectorAll(".summary .question-hyperlink")
            On Error GoTo 0
        End With

        If elem.Length > 0 Then
            Debug.Print elem(0).innerText
        Else
            Debug.Print "failed with: " & oProxy
        End If

    Next oProxy
End Sub
  • Note: A successful request returns the same result whichever proxy is used. My goal is simply to keep the script running until every proxy has been tried.

How can I let my script run until all the proxies have been exhausted?

robots.txt
  • BTW, it is possible to query Stack Overflow without web scraping: https://www.hardworkingnerd.com/how-to-query-the-stackoverflow-database/ – S Meaden Aug 21 '19 at 17:58
  • The site link used in the script above is just a placeholder. The question is about rotating proxies in VBA, @S Meaden. Thanks. – robots.txt Aug 21 '19 at 18:51

1 Answer

One possible approach is to track the total elapsed time of each request and give up once it exceeds a limit. The waiting loop also checks for run-time errors.

Sub MakeProxiedRequests()

    Const Timeout = "0:00:15" 'overall limit per request (h:mm:ss)

    Dim oHttp As New ServerXMLHTTP60
    Dim oHtml As New HTMLDocument
    Dim oElem As Object
    Dim aProxyList
    Dim sProxy
    Dim t As Date
    Dim bFailed As Boolean

    aProxyList = Array( _
        "191.96.42.184:3129", _
        "138.197.108.5:3128", _
        "35.245.145.147:8080", _
        "173.46.67.172:58517", _
        "191.96.42.82:3129", _
        "157.55.201.224:8080", _
        "67.205.172.239:3128", _
        "191.96.42.106:3129" _
    )
    For Each sProxy In aProxyList
        Debug.Print "Trying with: " & sProxy
        With oHttp
            .Open "GET", "https://stackoverflow.com/questions/tagged/web-scraping", True
            .setRequestHeader "User-Agent", "Mozilla/5.0"
            .setProxy 2, sProxy
            .setTimeouts 60000, 60000, 60000, 60000
            .send
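            'Deadline for the whole request, independent of the setTimeouts values
            t = Now() + TimeValue(Timeout)
            bFailed = False
            On Error Resume Next
            Do
                'Finished: the response has arrived
                If .readyState = 4 Then Exit Do
                'Give up when the deadline has passed or a run-time error occurred
                bFailed = (Now() > t) Or (Err.Number <> 0)
                If bFailed Then Exit Do
                DoEvents 'keep the host application responsive while waiting
            Loop
            On Error GoTo 0
            If Not bFailed Then
                oHtml.body.innerHTML = .responseText
                Set oElem = oHtml.querySelectorAll(".summary .question-hyperlink")
                'No matching element also counts as a failure
                bFailed = oElem.Length = 0
            End If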
        End With
        If Not bFailed Then
            Debug.Print oElem(0).innerText
        Else
            Debug.Print "Failed with: " & sProxy
        End If
    Next

End Sub
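
As a variation on the elapsed-time loop, ServerXMLHTTP also exposes a waitForResponse method that blocks for up to a given number of seconds and reports whether the response has arrived. The following is only a rough sketch of how it might be used, assuming a failed or slow proxy should simply be skipped. The helper name TryFetchViaProxy and its parameters are made up for illustration, and treating only HTTP status 200 as success is an assumption as well.

```vba
' Hypothetical helper (illustration only): returns the response body fetched
' through sProxy, or vbNullString if the request fails or does not complete
' within roughly nSeconds.
Function TryFetchViaProxy(ByVal sUrl As String, ByVal sProxy As String, _
                          ByVal nSeconds As Long) As String
    Dim oHttp As Object
    Dim i As Long

    On Error GoTo Failed
    ' Late binding, so no library reference is required
    Set oHttp = CreateObject("MSXML2.ServerXMLHTTP.6.0")
    With oHttp
        .Open "GET", sUrl, True
        .setRequestHeader "User-Agent", "Mozilla/5.0"
        .setProxy 2, sProxy
        .send
        ' Wait in one-second slices so DoEvents can keep the host responsive
        For i = 1 To nSeconds
            If .waitForResponse(1) Then Exit For
            DoEvents
        Next i
        If .readyState = 4 Then
            ' Assumption: only HTTP 200 counts as a usable response
            If .Status = 200 Then TryFetchViaProxy = .responseText
        Else
            .abort ' give up on a request that is still pending
        End If
    End With
    Exit Function
Failed:
    ' Any run-time error (bad proxy, refused connection) counts as a failure
    TryFetchViaProxy = vbNullString
End Function
```

Inside the For Each loop, each proxy could then be tried with something like sBody = TryFetchViaProxy(sUrl, CStr(sProxy), 15), moving on to the next proxy whenever sBody comes back empty. Creating a fresh request object per call is a deliberate choice here, so that a stalled request cannot affect the next attempt.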
omegastripes