0

I have the following code, and it's not executing due to an error message that reads "Method 'Execute' of object 'IRegExp2 failed"

Dim wrksht As Worksheet
Dim pg As Page
Dim regEx As New VBScript_RegExp_55.RegExp

For Each wrksht In Worksheets
    For Each pg In wrksht.PageSetup.pages
        With regEx
            .IgnoreCase = True
            .Pattern = "[(ftp|FTP|http|HTTP|s|S)]{3,5}*[\:/]{3}*[a-zA-Z0-9\s]{2,}[\.]*[.A-Za-z0-9\s]{2,}"
        End With

        If regEx.Execute(pg.CenterFooter.Text) Then
            pg.CenterFooter.Text = regEx.Replace(pg.CenterFooter.Text, "")
        End If
    Next pg
Next wrksht

I'm looking to replace the beginning part of URLs (www.sampleurl.com/subfolder/testfile.pdf) and keeping only the trailing part (www.sampleurl.com /subfolder/testfile.pdf, minus the space after the .com part)

Chrismas007
  • 6,085
  • 4
  • 24
  • 47
Just Rudy
  • 700
  • 11
  • 28
  • 1
    Have you tried simply replacing "*.com/" with ""? Unsure if you've got a set column/row with that info, but you could loop through pages and possibly avoid the regexp2 error. – Cyril Jan 18 '17 at 17:57
  • Thanks, @Cyril. I could try that, but that would only take care of "*.com/", and I'm expecting other domains ending in ".net", ".org", etc. I need to make this general enough to remove all url roots, and leave the trailing 'path'. – Just Rudy Jan 18 '17 at 18:46

2 Answers2

1

Slightly more generic than replacing the .com (per comments)... Hopefully this gets you where you need to be:

Dim LR As Long
LR = Cells(ActiveSheet.Rows.Count, 1).End(xlUp).Row

For i = 1 To LR

    Cells(i, 1) = Replace(Cells(i, 1), "https://", "", , 1)
    Cells(i, 1) = Replace(Cells(i, 1), "http://", "", , 1)
    Cells(i, 1) = Replace(Cells(i, 1), "/", "(error)", , 1)

    Next i

Range(Cells(1, 1), Cells(LR, 1)).Replace What:="*(error)", Replacement:=""

My assumption is there is a column with the addresses in them.

Cyril
  • 6,448
  • 1
  • 18
  • 31
0

I copied the regular expression from https://stackoverflow.com/a/6041965/5947891, and the following is a final version that worked for me. Basically, this will scan through all sheets in the workbook, check for url pattern, truncate the URL base (http ://www.sample.com/subfolder/subfile.png) and leave the trailing part(s) (e.g., http ://www.sample.com**/subfolder/subfile.png**). I added spaces after http in the URL to invalidate the href.

Sub FixFooter()
    Dim oSection As Word.Section
    Dim oRange As Word.Range
    Dim var
    Dim wrksht As Worksheet
    Dim pg As Page

    Dim regEx As New RegExp 'New VBScript_RegExp_55.RegExp
    Dim strPattern As String

    strPattern1 = "(http|ftp|https){0,1}(\://){0,1}([\w_-]+(?:(?:\.[\w_-]+)+))/"

    For Each wrksht In Worksheets
        Set pg = wrksht.PageSetup.pages(1)

        With regEx
            .IgnoreCase = True
            .Global = True
            .MultiLine = True
            .Pattern = strPattern1

        End With

        If regEx.Test(pg.CenterFooter.Text) Then
            pg.CenterFooter.Text = "/" & regEx.Replace(pg.CenterFooter.Text, "")
        End If
    Next wrksht
End Sub
Community
  • 1
  • 1
Just Rudy
  • 700
  • 11
  • 28