I recently developed a web scraper that uses VBA and Selenium. The purpose of the macro is to go through all of the links in every row in the first column in Sheet1, scrape the web content, send the results to the same cells in Sheet3 (i.e. the links that it opens in cell (9.1) in Sheet1 will be scraped and the results sent to cell (9.1) in Sheet3) and then go through the links in the second column and so on.
I created it by having two for loops (one to go through all the columns and then an inner for loop to go through all of the rows in each column). The scraper gets the right content and the loop for the rows works fine. However, the macro stops after processing all of the links in the first column instead of going to Column B to repeat the process.
I am assuming that there is something wrong with my outer for loop for the columns. Here is my code:
Sub scraping_web()
Dim chrm As Selenium.ChromeDriver
Dim row_no, col_no As Integer
Dim visitpage
Dim attributionlink
Dim fullHTMLlink
Dim alttag
row_no = 2
Application.ScreenUpdating = False
Set chrm = New Selenium.ChromeDriver
chrm.Start
For j = 1 To Sheet1.Cells(Columns.Count, 1).End(xlUp).Column
For i = 3 To Sheet1.Cells(Rows.Count, 1).End(xlUp).Row
visitpage = Sheet1.Cells(i, j).Value
chrm.Get (visitpage)
chrm.FindElementById ("attribution-html")
chrm.FindElementByXPath("/html/body/div/div/div/div[2]/main/section[2]/div/div[2]/div/div[1]/button[2]").Click
chrm.FindElementByXPath("/html/body/div/div/div/div[2]/main/figure/img").Text
Application.Wait Now + TimeValue("00:00:08")
attributionlink = chrm.FindElementByXPath("/html/body/div/div/div/div[2]/main/figure/img").Attribute("src")
alttag = chrm.FindElementByXPath("/html/body/div/div/div/div[2]/main/figure/img").Attribute("alt")
fullHTMLlink = "<img src=""" & attributionlink & """" & " " & "alt=" & """" & "This is a Creative Commons image with the title " & alttag & "" & """" & ">"
Sheet3.Cells(i, j).Value = fullHTMLlink & chrm.FindElementById("attribution-html").Text
Next
Next
End Sub
Please let me know if you see anything wrong with my outer for loop or if you have another solution that works better.
Thanks!
Edit: MY title initially said "first row" but I meant to say column