I have written a VBA macro to count the (approximate) number of images returned for a Google search of a specific term. By approximate I mean that the program should count the images returned, then scroll down to load more (where applicable), up to a maximum of 400 images counted. Here's the (simplified) code:
Sub GoogleCount()
    '''
    '[Code to construct the URL ('fullUrl')]
    '''
    Set objIE = New InternetExplorer
    objIE.navigate fullUrl
    Do While objIE.Busy = True Or objIE.readyState <> 4: DoEvents: Loop
    Set currPage = objIE.document

    'Count images returned
    newNum = currPage.getElementById("rg_s").getElementsByTagName("IMG").Length

    'Scroll down until count = 400 (max) or no change in value
    Do While newNum >= 100 And newNum < 400 And newNum <> oldNum
        oldNum = newNum
        currPage.parentWindow.scrollBy 0, 100000
        Do While objIE.Busy = True Or objIE.readyState <> 4: DoEvents: Loop
        newNum = currPage.getElementById("rg_s").getElementsByTagName("IMG").Length
    Loop
    '''
    '[Code to paste the value of newNum into my workbook, and do some other progress reporting]
    '''
End Sub
I'm unhappy about the scrolling: it feels very 'manual', especially when scrolling by a fixed value. (Is there any point in making it dynamic, i.e. finding the end of the page and scrolling to there?)
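For reference, the dynamic version I have in mind would look roughly like the sketch below. It is untested, ScrollToBottom is just a name I made up, and I'm assuming the page height is exposed through scrollHeight on either the body or the documentElement, so it takes the larger of the two.

```vba
' Hypothetical helper (not part of my real code): scroll to the current
' bottom of the page instead of scrolling by a fixed 100000 pixels.
Private Sub ScrollToBottom(ByVal doc As Object)
    Dim pageHeight As Long

    ' Assumption: the larger of the two scrollHeight values is the real
    ' page height, whichever rendering mode the page is in.
    pageHeight = doc.body.scrollHeight
    If doc.documentElement.scrollHeight > pageHeight Then
        pageHeight = doc.documentElement.scrollHeight
    End If

    doc.parentWindow.scrollTo 0, pageHeight
End Sub
```

It would be called as ScrollToBottom currPage in place of the scrollBy line, but I don't see that it gains anything over the fixed value.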
But the main problem is that it doesn't work: when I execute the code, it counts the first 100 (or fewer) images fine, but when it's supposed to scroll and count some more, I still get 100 returned. Stepping slowly through the code with F8, I get the proper numbers (max 400), which leads me to conclude that the code is running too quickly for the new images to load (I may be wrong).
To slow the code down I tried adding the objIE.readyState check loop, but because I'm only scrolling I don't think it counts as the page 're-loading', so the loop is ineffective in waiting for the new images to load.
I've thought about adding in a time delay instead. I am already employing
Private Declare Sub Sleep Lib "kernel32" (ByVal dwMilliseconds As Long)
elsewhere in the worksheet, so I could add a delay as short as a few milliseconds.
But I really want to avoid using that. This code runs for c. 50 different searches and already takes long enough to execute, so adding fixed delays that are long enough to accommodate slow connection speeds would not be ideal. Also, internet speeds vary so much that a fixed delay is very unreliable. I could carry out some kind of connection test to get a better ball-park figure, but the best option is obviously to wait only as long as you have to.
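To make "wait only as long as you have to" concrete, this is the sort of thing I'm imagining: poll the image count after each scroll and stop as soon as it rises, with a timeout as a fallback. It is only a sketch and untested. WaitForMoreImages, the 50 ms poll interval and the 5 second timeout are all made up for illustration, and it relies on the Sleep declaration above.

```vba
' Hypothetical helper: poll until the image count rises above oldCount,
' or until timeoutSecs has elapsed. Returns the last count seen.
Private Function WaitForMoreImages(ByVal doc As Object, ByVal oldCount As Long, _
                                   Optional ByVal timeoutSecs As Single = 5) As Long
    Dim startTime As Single
    Dim elapsed As Single
    Dim currentCount As Long

    startTime = Timer
    Do
        DoEvents            ' let IE process the scroll and load images
        Sleep 50            ' short poll interval (arbitrary choice)
        currentCount = doc.getElementById("rg_s").getElementsByTagName("IMG").Length
        If currentCount > oldCount Then Exit Do

        elapsed = Timer - startTime
        If elapsed < 0 Then elapsed = elapsed + 86400   ' Timer resets at midnight
    Loop While elapsed < timeoutSecs

    WaitForMoreImages = currentCount
End Function
```

Inside the scroll loop, the second readyState check and the recount would then become newNum = WaitForMoreImages(currPage, oldNum). It returns as soon as the count rises at all, so a batch that is still loading gets picked up on the next pass of the outer loop. The timeout is still a fixed wait whenever there is genuinely nothing more to load, which is part of why I'd prefer a cleaner approach.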
Or, better still, is there a different way of counting the images, preferably one which doesn't involve re-loading the page 4 times? Any ideas?
NB. If you want to debug this yourself, a good image search to set fullUrl to might be
https://www.google.com/search?q=stack overflow|exchange&tbm=isch&source=lnt&tbs=isz:ex,iszw:312,iszh:390
as it returns more than 100 images but fewer than 400, so you can test all aspects of the code.