I have been attempting to use IE automation to google search a string of text in Excel. I want to return the hyperlink for the website of the first result in another cell in excel. Is this possible? I have a list of 60,000 records that I need to google search and return the hyperlink for the website in the first result. Is there another approach to this that you would reccomend? I appreciate the help in advance.
2 Answers
As its 60,000 records i recommend use xmlHTTP object instead of using IE.
HTTP requests a easier, and a lot faster
Sub XMLHTTP()
Dim url As String, lastRow As Long, i As Long
Dim XMLHTTP As Object, html As Object, objResultDiv As Object, objH3 As Object, link As Object
Dim start_time As Date
Dim end_time As Date
lastRow = Range("A" & Rows.Count).End(xlUp).Row
Dim cookie As String
Dim result_cookie As String
start_time = Time
Debug.Print "start_time:" & start_time
For i = 2 To lastRow
url = "https://www.google.co.in/search?q=" & Cells(i, 1) & "&rnd=" & WorksheetFunction.RandBetween(1, 10000)
Set XMLHTTP = CreateObject("MSXML2.serverXMLHTTP")
XMLHTTP.Open "GET", url, False
XMLHTTP.setRequestHeader "Content-Type", "text/xml"
XMLHTTP.setRequestHeader "User-Agent", "Mozilla/5.0 (Windows NT 6.1; rv:25.0) Gecko/20100101 Firefox/25.0"
XMLHTTP.send
Set html = CreateObject("htmlfile")
html.body.innerHTML = XMLHTTP.ResponseText
Set objResultDiv = html.getelementbyid("rso")
Set objH3 = objResultDiv.getelementsbytagname("h3")
For Each link In objH3
If link.className = "r" Then
Cells(i, 2) = link.innerText
Cells(i, 3) = link.getelementsbytagname("a")(0).href
DoEvents
End If
Next
Next
end_time = Time
Debug.Print "end_time:" & end_time
Debug.Print "done" & "Time taken : " & DateDiff("n", start_time, end_time) & " :minutes"
MsgBox "done" & "Time taken : " & DateDiff("n", start_time, end_time)
End Sub
Using CSS3 Selector
Sub XMLHTTP1()
Dim url As String, i As Long, lastRow As Long
Dim XMLHTTP As Object, html As New HTMLDocument, objResultDiv As HTMLAnchorElement
lastRow = Range("A" & Rows.Count).End(xlUp).Row
For i = 2 To lastRow
url = "https://www.google.co.in/search?q=" & Cells(i, 1) & "&rnd=" & WorksheetFunction.RandBetween(1, 10000)
Set XMLHTTP = CreateObject("MSXML2.serverXMLHTTP")
XMLHTTP.Open "GET", url, False
XMLHTTP.setRequestHeader "Content-Type", "text/xml"
XMLHTTP.setRequestHeader "User-Agent", "Mozilla/5.0 (Windows NT 6.1; rv:25.0) Gecko/20100101 Firefox/25.0"
XMLHTTP.send
Set html = New HTMLDocument
html.body.innerHTML = XMLHTTP.ResponseText
Set objResultDiv = html.querySelector("div#rso h3.r a")
Cells(i, 2) = objResultDiv.innerText
Cells(i, 3) = objResultDiv.href
DoEvents
Next
End Sub
Output
HTH
Santosh

- 12,175
- 4
- 41
- 72
-
1Thanks! That works well except after about 100 records I recieve a Run Error 80070005 Access is denied. Any clues on why? – Collin Hendo Jul 08 '13 at 21:38
-
1@CollinHendo Nope, will have to see your data for that. Alternatively you can add 'On error resume next' on top of code. If the solution was helpful pls vote. – Santosh Jul 09 '13 at 00:47
-
I could provide you a sample of my data. How could I do that? – Collin Hendo Jul 09 '13 at 12:52
-
Actually the error occurs even if I attempt to use the code on the sample that you provided.Any hints? – Collin Hendo Jul 09 '13 at 13:19
-
It is a strange error. After running about 100-125 record, I get the access denied and cannot use it in any other workbook or anything. – Collin Hendo Jul 09 '13 at 13:49
-
Is it possible that my server cuts off access after pulling 100 or so records and then I am on a delay or cut off? – Collin Hendo Jul 09 '13 at 15:11
-
@CollinHendo Apologies for late response. You can share the workbook using dropbox or google drive and give me the link. I shall get back to you as feasible to me. – Santosh Jul 09 '13 at 18:10
-
https://www.dropbox.com/s/o1rj4fqht2I1ws6/Sample.xlsm Thanks! That's just a sample. As you can see, it ended after about 110 records. – Collin Hendo Jul 09 '13 at 19:37
-
@CollinHendo The link is not working for me. Please make sure you share the link publicly. – Santosh Jul 10 '13 at 03:35
-
My apologies. I thought it was in my public folder. Here is the correct link https://www.dropbox.com/s/t4qh8vo4g88ulu3/Sample.xlsm – Collin Hendo Jul 10 '13 at 10:16
-
I too get this error, just about after 60 entries, any resolutions to this please. – Vasim Nov 06 '13 at 07:58
-
@Vasim I will get back to you on this by tommorow. Can you take a screenshot of issue and give the link. – Santosh Nov 06 '13 at 09:41
-
[link](https://drive.google.com/file/d/0B9u_K1HbgiEKUk5JMlBZdWRzZFE/edit?usp=sharing), it stops at "xmlHttp.send" after sometime (5min or so) it again works - stops after some next 60 entries and so on..... – Vasim Nov 06 '13 at 10:16
-
@Vasim I was able to simulate the issue. With lack of time i will try to fix it in weekend. In the meanwhile you may try to add some delay or make async request. – Santosh Nov 08 '13 at 05:16
-
1@Vasim I have updated the code. Give it a try and let me know the response. I tested for about 140 records which ran uninterrupted. Alternatively you may download the sample file from [here](http://goo.gl/VudD5z) – Santosh Nov 14 '13 at 18:26
-
@Santosh I was trying the get the data of a span tag with class name "_Ex", the code doesn't seem to work. I used the following: `code` Set html = CreateObject("htmlfile") html.body.innerHTML = XMLHTTP.ResponseText Set objResultDiv = html.getelementbyId("rso") Set objH3 = objResultDiv.getelementsbyTagName("span")(0) Set link = objH3.findElementsbyClassName("_Ex") `code` – Krithi07 Apr 21 '16 at 08:10
-
Great script. +1. My question is, I have thousands of records. I want to check with google. Can i try without limitation ? If i try to thousands of records, Will google think me that DDOS attack like that ? Is it legit ? – Venkat Dec 11 '16 at 14:02
The links seem to be consistently within within H3 tags. Normally you might use something like the following to check until the page has loaded:
Private Declare Sub Sleep Lib "kernel32" (ByVal nMilliseconds As Long)
Sub UseIE()
Dim ie As Object
Dim thePage As Object
Dim strTextOfPage As String
Set ie = CreateObject("InternetExplorer.Application")
'ie.FullScreen = True
With ie
'.Visible = True
.Navigate "http://www.bbc.co.uk"
While Not .ReadyState = READYSTATE_COMPLETE '4
Sleep 500 'wait 1/2 sec before trying again
Wend
End With
Set thePage = ie.Document
'more code here
End Sub
However, I would, instead, repeatedly try to reference the A element within the first H3 using getElementsByTagName("H3")
, get the first of these elements, then look within this for the A-link and its href-attribute.
In JavaScript the attempts to reference non-existent elements would return undefined
but from VBA it will probably need error-handling code.
Once I had obtained the href I would stop the navigation (not sure of the command for this, probably ie.Stop
) or navigate to the next page immediately.
The first link(s) will, however, often be sponsored links and the href returned is a little garbled. The text of these sponsored links appear to include em
tags. I might use this information to discard these links and look further down the page.
I don't know if there is a better way to do this.

- 19,232
- 5
- 47
- 69