0

I had a question about how to retrieve the contents of a final URL in Microsoft excel, using their Visual BASIC macros.

Essentially, I have a list (ListA) full of URLs. I have code written to trawl every URL on List A one by one and retrieve the data I need, putting it into Excel.

However, a certain percentage of the URLs redirect to 404 pages. I do not have any way of knowing which ones these will be in advance currently, and am trying to write a script that will simply:

1.) Access the URL in ListA

2.) Copy the destination URL that it redirects to

3.) Paste that URL into the cell directly to the right of the original URL

That way I can see what the final URLs are, and if any go to a 404 page, I can delete them from the list before attempting to trawl it for the info I need.

I have had no luck in doing so thus far, and every tutorial I can find online seems to feature code that will not work in Microsoft Excel's limited environment. Does anyone have any idea where I should start?

If it helps, here is the code I have written to trawl the webpage for data:

For i = 1 To 500
    ThisURL = "URL;" & WSD.Cells(i, 2)
    ThisParcel = "P" & WSD.Cells(i, 1)
    Set WSW = Worksheets.Add(After:=Worksheets(Worksheets.Count))
    WSW.Name = ThisParcel
    WSW.Select

    ' Do a web query here
With ActiveSheet.QueryTables.Add(Connection:= ThisURL, Destination:=Range("$A$1"))
    .Name = "Query" & i
    .FieldNames = True
    .RowNumbers = False
    .FillAdjacentFormulas = False
    .PreserveFormatting = True
    .RefreshOnFileOpen = False
    .BackgroundQuery = True
    .RefreshStyle = xlInsertDeleteCells
    .SavePassword = False
    .SaveData = True
    .AdjustColumnWidth = True
    .RefreshPeriod = 0
    .WebSelectionType = xlEntirePage
    .WebFormatting = xlWebFormattingNone
    .WebPreFormattedTextToColumns = True
    .WebConsecutiveDelimitersAsOne = True
    .WebSingleBlockTextImport = False
    .WebDisableDateRecognition = False
    .WebDisableRedirections = False
    .Refresh BackgroundQuery:=False
End With

Next i
Derek S.
  • 1
  • 1
  • this is not wrong but certainly not a good idea because as u add connections, when there are too many connection, it slows down ur workbook until it is creepy crawly. apparently deleting the connections don't speed it back up. I'd suggest u take a look at creating an `InternetExplorer` method here http://stackoverflow.com/questions/27066963/scraping-data-from-website-using-vba – Rosetta Sep 16 '16 at 04:47
  • Something like the answers here would be cleaner I think: http://stackoverflow.com/questions/36064315/excel-vba-script-to-find-404-errors-in-a-list-of-urls – Tim Williams Sep 16 '16 at 05:37
  • Tim, that is absolutely perfect! I ran it, and it functions perfectly. Than you so much! – Derek S. Sep 16 '16 at 23:03

0 Answers0