1

I've written a script in vba to scrape the ip address populated upon making a proxied request. I've used proxy (out of list of proxies) within my vba script to test (probably none of them are working at this moment).

However, what I want to achieve is that when a requests is failed the following script will print that error message and keep going for the next requests otherwise it will parse the ip address from that site and keep going until the loops gets exhausted.

My attempt so far (consider the proxyList to be the working ones):

Sub ValidateProxies()
    Dim Http As New ServerXMLHTTP60, elem As Object, S$
    Dim proxyList As Variant, oProxy As Variant

    proxyList = [{"98.163.59.8:8080","134.209.115.223:3128","191.101.233.198:3129","198.177.126.218:80","35.185.201.225:8080"}]

    For Each oProxy In proxyList
        On Error Resume Next
        With Http
            .Open "GET", "https://www.myip.com/", False
            .setRequestHeader "User-Agent", "Mozilla/5.0"
            .setProxy 2, oProxy
            .send
        End With
        On Error GoTo 0

        If Err.Number <> 0 Then
            Debug.Print "Encountered an error"

        Else:
            With New HTMLDocument
                .body.innerHTML = Http.responseText
                Set elem = .querySelector("#ip")
                R = R + 1: Cells(R, 1) = oProxy
                Cells(R, 2) = elem.innerText
            End With
        End If
    Next oProxy
End Sub

How can I make my script print any error when there is one and keep rolling until the loop ends?

robots.txt
  • 96
  • 2
  • 10
  • 36
  • Did you try stepping through the code? – Stupid_Intern Mar 17 '19 at 12:46
  • Yep, I did it just now and could find out that in it's first loop it works just fine but when it loops for the second time then it throws this error `data necessary to complete the operation is not yet available` pointing at this line `.body.innerHTML = Http.responseText`. – robots.txt Mar 17 '19 at 13:03
  • End the sub if you get an error and have an outer loop repeatedly calling this sub and print possible error messages. Make the sub a function instead that can return possible error messages to print –  Mar 17 '19 at 13:08

2 Answers2

2

Here is the example with async requests pool and logging statuses and errors to a worksheet. It uses a proxy list from free-proxy-list.net.

Option Explicit

Sub TestProxy()

    Const PoolCapacity = 50
    Const ReqTimeout = 15

    Dim sResp
    Dim aProxyList
    Dim oMatch
    Dim oWS
    Dim lIndex
    Dim ocPool
    Dim i
    Dim sResult
    Dim oReq

    ' Parsing proxy list from free-proxy-list.net
    With CreateObject("MSXML2.ServerXMLHTTP.6.0")
        .Open "GET", "https://free-proxy-list.net/", True
        .SetRequestHeader "User-Agent", "Mozilla/5.0 (Windows NT 6.1; Win64; x64)"
        .Send
        Do Until .ReadyState = 4: DoEvents: Loop
        sResp = .ResponseText
    End With
    With CreateObject("VBScript.RegExp")
        .Global = True
        .Pattern = "<td[^>]*>(\d+\.\d+\.\d+\.\d+)<\/td><td[^>]*>(\d+)<\/td>"
        aProxyList = Array()
        For Each oMatch In .Execute(sResp)
            ReDim Preserve aProxyList(UBound(aProxyList) + 1)
            aProxyList(UBound(aProxyList)) = oMatch.SubMatches(0) & ":" & oMatch.SubMatches(1)
        Next
    End With
    ' Proxy checking with api.myip.com requests
    Set oWS = ThisWorkbook.Sheets(1)
    oWS.Cells.Delete
    Set ocPool = New Collection
    lIndex = 0
    Do
        ' Check pool for completed requests
        For i = ocPool.Count To 1 Step -1
            On Error Resume Next
            sResult = ""
            With ocPool(i)(0)
                Select Case True
                    Case .ReadyState < 4
                    Case .Status \ 100 <> 2
                        sResult = "Status " & .Status & " / " & .StatusText
                    Case Else
                        sResult = .ResponseText
                End Select
            End With
            Select Case True
                Case Err.Number <> 0
                    sResult = "Error " & Err.Number & " / " & Err.Description
                Case (Now - ocPool(i)(1)) * 86400 > ReqTimeout
                    sResult = "Timeout"
            End Select
            On Error GoTo 0
            If sResult <> "" Then
                oWS.Cells(ocPool(i)(2), 2).Value = sResult
                ocPool.Remove i
            End If
            DoEvents
        Next
        ' Add new request to pool
        If ocPool.Count < PoolCapacity And lIndex <= UBound(aProxyList) Then
            Set oReq = CreateObject("MSXML2.ServerXMLHTTP.6.0")
            With oWS.Cells(lIndex + 1, 1)
                .Value = aProxyList(lIndex)
                .Select
            End With
            With oReq
                .Open "GET", "https://api.myip.com/", True
                .SetProxy 2, aProxyList(lIndex)
                .SetRequestHeader "User-Agent", "Mozilla/5.0 (Windows NT 6.1; Win64; x64)"
                .Send
            End With
            ocPool.Add Array( _
                oReq, _
                Now, _
                lIndex + 1 _
            )
            lIndex = lIndex + 1
            DoEvents
        End If
    Loop While ocPool.Count > 0
    MsgBox "Completed"

End Sub
omegastripes
  • 12,351
  • 4
  • 45
  • 96
1

This will print all errors encountered and you should tailor by err.Number

Option Explicit
Public Sub ValidateProxies()
    Dim http As New ServerXMLHTTP60, elem As Object, S$
    Dim proxyList As Variant, oProxy As Variant, r As Long
    Dim html As HTMLDocument
    Set html = New HTMLDocument
    proxyList = [{"98.163.59.8:8080","134.209.115.223:3128","191.101.233.198:3129","198.177.126.218:80","35.185.201.225:8080"}]

    For Each oProxy In proxyList
        On Error GoTo errhand:
        With http
            .Open "GET", "https://www.myip.com/", False
            .setRequestHeader "User-Agent", "Mozilla/5.0"
            .SetProxy 2, oProxy
            .send
            With html
                .body.innerHTML = http.responseText
                Set elem = .querySelector("#ip")
                r = r + 1: ActiveSheet.Cells(r, 1) = oProxy
                ActiveSheet.Cells(r, 2) = elem.innerText
            End With
        End With
    Next oProxy
    Exit Sub

errhand:
    If Err.Number <> 0 Then
        Debug.Print "Encountered an error " & Err.Description, oProxy
        Err.Clear
        Resume Next
    End If

End Sub
QHarr
  • 83,427
  • 12
  • 54
  • 101
  • Oopsy! Typo. Try now. – QHarr Mar 17 '19 at 16:47
  • odd as r is still incremented even if there is an error – QHarr Mar 17 '19 at 17:03
  • you can try ActiveSheet.Cells(r, 2) = elem.innerText: Set elem = Nothing – QHarr Mar 17 '19 at 17:03
  • Sadly I can't do much to debug. I would try to understand why it is overwriting if r is still being incremented. Does the same thing happen with set elem = nothing: Set elem = .querySelector("#ip") With that on separate links not with : – QHarr Mar 17 '19 at 17:08
  • also put r = r + 1: ActiveSheet.Cells(r, 1) = oProxy on separate lines and oProxy = vbNullString on line after – QHarr Mar 17 '19 at 17:14
  • It is the better way to go as On Error Resume Next simply says ignore error. You would need to add in repeated lines of If err.Number <> 0 and debug.print err.message and err.Clear inside code otherwise. – QHarr Mar 17 '19 at 17:16
  • you could always hold off accepting and ask the [ducks](https://chat.stackexchange.com/rooms/14929/vba-rubberducking) to have a look. @mathieuguindon in particular has done some nice boilerplate stuff with err handling. – QHarr Mar 17 '19 at 17:19
  • that sounds interesting. I would be interested in knowing more as I could use it for debugging. – QHarr Mar 17 '19 at 17:20
  • Yep, mathieu guindon once suggested me to use that add-in to do the indenting and I learnt the indenting from there but it makes the application way slower so I got rid of it. – robots.txt Mar 17 '19 at 17:23
  • I love it. But I have seen some answers of his elsewhere on site where he shows boiler plate stuff that is tidy for error handling. They have always been very helpful when I have posted a question link and asked for help. – QHarr Mar 17 '19 at 17:24