
Is there a way to check the syntax of a URL in Visual Basic? My code is below. I only need to check the syntax and make sure it is correct (i.e., it has http and ends in .com, .net, or .edu). I need to verify that the URLs follow the standard URL format. Can this be done?

Public PageSource As String
Public httpRequest As Object

Function GetURLStatus(ByVal URL As String, Optional AllowRedirects As Boolean)

    Const WinHttpRequestOption_EnableRedirects = 6

    ' Create the WinHTTP object once and reuse it on later calls
    If httpRequest Is Nothing Then
        On Error Resume Next
        Set httpRequest = CreateObject("WinHttp.WinHttpRequest.5.1")
        If httpRequest Is Nothing Then
            Set httpRequest = CreateObject("WinHttp.WinHttpRequest.5")
        End If
        Err.Clear
        On Error GoTo 0
    End If

    httpRequest.Option(WinHttpRequestOption_EnableRedirects) = AllowRedirects

    ' Default to http:// when the URL has no scheme
    If InStr(1, URL, "://") = 0 Then
        URL = "http://" & URL
    End If

    On Error Resume Next
    httpRequest.Open "GET", URL, False
    If Err.Number <> 0 Then
        ' Handle connection errors
        GetURLStatus = Err.Description
        Err.Clear
        Exit Function
    End If
    On Error GoTo 0

    On Error Resume Next
    httpRequest.Send
    httpRequest.WaitForResponse
    If Err.Number <> 0 Then
        PageSource = "Error"
        GetURLStatus = Err.Description
        Err.Clear
    Else
        GetURLStatus = httpRequest.Status & " - " & httpRequest.StatusText
        PageSource = httpRequest.ResponseText
    End If
    On Error GoTo 0

End Function
  • Maybe check out the [Uri class](https://msdn.microsoft.com/en-us/library/system.uri(v=vs.110).aspx?f=255&MSPPError=-2147217396&cs-save-lang=1&cs-lang=vb#code-snippet-1) – Mike Christensen Jun 02 '16 at 21:37
  • Consider setting up a generalised regular expression? – Dave Jun 03 '16 at 00:19
  • @MikeChristensen is it exposed to COM seen as though the OP is after a VBScript solution? – user692942 Jun 03 '16 at 07:01
  • I think you may have mistagged this. That code doesn't look like VBScript to me. Are you looking for VBScript, VBA, VB6, or VB.NET? –  Jun 03 '16 at 16:01

1 Answer


Three approaches come to mind: regular expressions, using XMLHTTP, and using a third-party library.

If you're OK with using another language, you could write your own ActiveX control. VB.NET, for example, has built-in classes that make validating a URL pretty easy. I'm assuming you're looking for a VBScript-only answer, so I won't try to cover that option.

First, you probably want to figure out what types of URL you want to cover. According to the spec, all of these are valid URLs:

ftp://ftp.is.co.za/rfc/rfc1808.txt
http://www.ietf.org/rfc/rfc2396.txt
ldap://[2001:db8::7]/c=GB?objectClass?one
mailto:John.Doe@example.com
news:comp.infosystems.www.servers.unix
tel:+1-816-555-1212
telnet://192.0.2.16:80/
urn:oasis:names:specification:docbook:dtd:xml:4.1.2

The narrower your scope, the less complicated your solution needs to be. If you need to cover all possible types, I would look at a third-party library. A quick Google search found this. This is not an endorsement: I've never used this library, and I'm sure there are many other good libraries out there.

You can try the regular expression method, but it's filled with edge cases that might drive you crazy. Again, if you can narrow your scope you'll be more successful. Here's a detailed discussion on using regular expressions to validate URLs. This is also where I shamelessly stole the regular expression in the example below :).

My VB is a little rusty, but here's an example of the regex approach...

Wscript.Echo IsUrlValidRegex("http://www.stackoverflow.com")
Wscript.Echo IsUrlValidRegex("this is not a url")
Wscript.Echo IsUrlValidRegex("mailto:John.Doe@example.com")

Function IsUrlValidRegex(url)
    Dim oRegex
    Set oRegex = New RegExp

    ' The pattern is not anchored, so Test returns True when a
    ' URL-like substring appears anywhere in the input string.
    oRegex.Pattern = "((([A-Za-z]{3,9}:(?:\/\/)?)(?:[-;:&=\+\$,\w]+@)?[A-Za-z0-9.-]+|(?:www.|[-;:&=\+\$,\w]+@)[A-Za-z0-9.-]+)((?:\/[\+~%\/.\w-_]*)?\??(?:[-\+=&;%@.\w_]*)#?(?:[\w]*))?)"
    oRegex.IgnoreCase = True

    ' Test already returns a Boolean, so return it directly
    IsUrlValidRegex = oRegex.Test(url)
End Function
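
If the scope is narrowed to what the question actually asks for (http or https, and a host ending in .com, .net, or .edu), a much smaller anchored pattern may be enough. The following is only a sketch under that assumption: the function name IsSimpleHttpUrl and the pattern are mine, not from the discussion mentioned above, and the pattern deliberately rejects IP addresses, user info, and other top-level domains.

```vbscript
Option Explicit

' Hypothetical helper (not part of the original answer): accepts only
' http/https URLs whose host name ends in .com, .net, or .edu.
Function IsSimpleHttpUrl(sUrl)
    Dim oRegex
    Set oRegex = New RegExp

    ' ^ and $ force the whole string to match, not just a substring.
    ' scheme :// one or more dot-separated labels, an allowed TLD,
    ' an optional port, and an optional path, query, or fragment.
    oRegex.Pattern = "^https?://([a-z0-9-]+\.)+(com|net|edu)(:\d+)?([/?#]\S*)?$"
    oRegex.IgnoreCase = True

    IsSimpleHttpUrl = oRegex.Test(sUrl)
End Function

WScript.Echo IsSimpleHttpUrl("http://www.stackoverflow.com")   ' accepted
WScript.Echo IsSimpleHttpUrl("this is not a url")              ' rejected
WScript.Echo IsSimpleHttpUrl("mailto:John.Doe@example.com")    ' rejected
```

Extending the list of allowed top-level domains only means adding alternatives to the (com|net|edu) group, which keeps the edge cases manageable compared with a pattern that tries to cover every scheme.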

Another possible approach is to try to hit the URL to determine if it's valid or not. This sounds good at first, but it only works with HTTP/S and it works best when hitting a live server. I'm not sure that I would recommend this method, but here's how it might work...

Wscript.Echo IsUrlValidHttp("http://www.stackoverflow.com")
Wscript.Echo IsUrlValidHttp("http://not.arealwebsite.com/")
Wscript.Echo IsUrlValidHttp("this is not a url")
Wscript.Echo IsUrlValidHttp("mailto:John.Doe@example.com")

Function IsUrlValidHttp(sUrl)
    On Error Resume Next

    Dim oXMLHTTP
    Set oXMLHTTP = CreateObject("MSXML2.ServerXMLHTTP")

    ' Default to False so that any unexpected error counts as invalid
    IsUrlValidHttp = False

    oXMLHTTP.Open "GET", sUrl, False
    oXMLHTTP.Send

    If Err.Number = 0 Then
        ' Valid HTTP URL, valid server
        'If oXMLHTTP.Status = 200 Then
            IsUrlValidHttp = True
        'End If
    ElseIf Err.Number = -2147012889 Then
        ' Valid HTTP URL, but the server name could not be resolved
        IsUrlValidHttp = True
    ElseIf Err.Number = -2147467259 Then
        ' Not a valid HTTP URL
        IsUrlValidHttp = False
    End If
End Function

I'd try the regular expression method if you're OK with the possible edge cases. Otherwise, I'd look at a third-party library.

sarme