I'm trying to scrape data from the website http://uk.investing.com/rates-bonds/financial-futures via VBA, such as real-time prices (e.g. German 5 YR Bobl, US 30Y T-Bond). I have tried an Excel web query, but it only scrapes the whole page, and I would like to scrape the rate only. Is there a way of doing this?
-
You need to use DOM/XML parser then you can either iterate the collection of elements or refer to specific elements by their xpath. – David Zemens Nov 21 '14 at 17:28
5 Answers
There are several ways of doing this. I am writing this answer in the hope that anyone searching for the keywords "scraping data from website" will find all the basics of Internet Explorer automation here, but remember that nothing is worth as much as your own research (if you don't want to stick to pre-written code that you're not able to customize).
Please note that this is only one way, and not the one I prefer in terms of performance (since it depends on the browser's speed), but it is a good way to understand the rationale behind Internet automation.
1) If I need to browse the web, I need a browser! So I create an Internet Explorer browser:
Dim appIE As Object
Set appIE = CreateObject("internetexplorer.application")
2) I ask the browser to browse the target webpage. Through the ".Visible" property, I decide whether I want to see the browser doing its job or not. While building the code it is nice to have Visible = True, but once the code is working it is nicer not to see the browser every time, so Visible = False.
With appIE
.Navigate "http://uk.investing.com/rates-bonds/financial-futures"
.Visible = True
End With
3) The webpage will need some time to load. So, I will wait while it is busy...
Do While appIE.Busy
DoEvents
Loop
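A note on step 3: Busy can occasionally report False before the document has fully loaded, so a more defensive wait might also check readyState and give up after a timeout. The following is a minimal sketch only; the helper name WaitForPage and the 30-second default are assumptions, not part of the original answer.

```vba
' Hypothetical helper: waits until the IE instance has finished loading,
' or gives up after timeoutSeconds. Returns True if the page became ready.
Public Function WaitForPage(ByVal ie As Object, _
                            Optional ByVal timeoutSeconds As Double = 30) As Boolean
    Const READYSTATE_COMPLETE As Long = 4
    Dim startTime As Double
    startTime = Timer   ' seconds since midnight

    Do While ie.Busy Or ie.readyState <> READYSTATE_COMPLETE
        DoEvents
        ' Timer resets at midnight; this simple check ignores that edge case
        If Timer - startTime > timeoutSeconds Then Exit Function   ' returns False
    Loop

    WaitForPage = True
End Function
```

It could then be called as `If Not WaitForPage(appIE) Then MsgBox "Page did not load in time"`.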
4) Well, now the page is loaded. Let's say that I want to scrape the change of the US 30Y T-Bond: I just press F12 in Internet Explorer to see the webpage's code and then, using the pointer (in the red circle), I click on the element that I want to scrape to see how I can reach it.
5) What I should do is straightforward. First of all, I will get, by its ID property, the tr element which contains the value:
Set allRowOfData = appIE.document.getElementById("pair_8907")
Here I will get a collection of td elements (specifically, tr is a row of data, and the td elements are its cells). We are looking for the 8th, so I will write:
Dim myValue As String: myValue = allRowOfData.Cells(7).innerHTML
Why did I write 7 instead of 8? Because the collection of cells starts from 0, so the index of the 8th element is 7 (8-1). Briefly analysing this line of code: .Cells() gives me access to the td elements; innerHTML is the property of the cell containing the value we are looking for.
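If it is not obvious which index holds the value you want, one option is to print every cell of the row together with its zero-based index. This is only a sketch; ListRowCells is a hypothetical helper name, and it assumes the object passed in is a table row such as the one returned by getElementById above.

```vba
' Hypothetical debugging helper: prints each cell of a table row
' with its zero-based index to the Immediate window (Ctrl+G).
Public Sub ListRowCells(ByVal tableRow As Object)
    Dim i As Long
    For i = 0 To tableRow.Cells.Length - 1
        Debug.Print i, tableRow.Cells(i).innerText
    Next i
End Sub
```

For example, `ListRowCells appIE.document.getElementById("pair_8907")` would list the cells of the row used in this answer, so you can read off the index you need.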
Once we have our value, which is now stored in the myValue variable, we can close the IE browser and release the memory by setting it to Nothing:
appIE.Quit
Set appIE = Nothing
Well, now you have your value and you can do whatever you want with it: put it into a cell (Range("A1").Value = myValue), or into a label of a form (Me.Label1.Caption = myValue).
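Put together, steps 1 to 5 might look roughly like the procedure below. It is a sketch, not a definitive implementation: the Sub name GetBondChange, the Nothing check and the error handling are additions, and the id "pair_8907" and cell index 7 depend on the page's markup at the time the answer was written.

```vba
Option Explicit

' Consolidated sketch of steps 1-5 (assumed name: GetBondChange).
Public Sub GetBondChange()
    Dim appIE As Object
    Dim rowOfData As Object
    Dim myValue As String

    Set appIE = CreateObject("InternetExplorer.Application")
    On Error GoTo CleanUp   ' make sure the browser is closed even if something fails

    With appIE
        .Visible = False
        .Navigate "http://uk.investing.com/rates-bonds/financial-futures"
    End With

    ' Wait for the page to load (4 = READYSTATE_COMPLETE)
    Do While appIE.Busy Or appIE.readyState <> 4
        DoEvents
    Loop

    Set rowOfData = appIE.document.getElementById("pair_8907")
    If rowOfData Is Nothing Then
        MsgBox "Row pair_8907 was not found; the page layout may have changed."
    Else
        ' innerText returns the cell's text without any nested markup
        myValue = rowOfData.Cells(7).innerText
        ActiveSheet.Range("A1").Value = myValue
    End If

CleanUp:
    If Err.Number <> 0 Then MsgBox "Scrape failed: " & Err.Description
    appIE.Quit
    Set appIE = Nothing
End Sub
```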
I'd just like to point out that this is not how StackOverflow works: here you post questions about specific coding problems, but you should do your own research first. The reason why I'm answering a question which does not show much research effort is that I have seen it asked several times and, back when I learned how to do this, I remember that I would have liked some better support to get started. So I hope that this answer, which is just a "study input" and not at all the best or most complete solution, can be of help to the next user with the same problem. I learned how to program thanks to this community, and I like to think that you and other beginners might use my input to discover the beautiful world of programming.
Enjoy your practice ;)

-
Definitely a great answer from someone with years of VBA experience but never thought of doing this. And it may help me answer someone else's question. – Rick Henderson Feb 19 '16 at 03:29
-
Another questioner wanted more, so I gave it to him based on your starting point http://stackoverflow.com/questions/41848354/website-data-table-scraper/41902620#41902620 – S Meaden Jan 27 '17 at 20:40
Other methods were mentioned so let us please acknowledge that, at the time of writing, we are in the 21st century. Let's park the local bus browser opening, and fly with an XMLHTTP GET request (XHR GET for short).
XHR is an API in the form of an object whose methods transfer data between a web browser and a web server. The object is provided by the browser's JavaScript environment.
It's a fast method for retrieving data that doesn't require opening a browser. The server response can be read into an HTMLDocument and the process of grabbing the table continued from there.
Note that JavaScript-rendered or dynamically added content will not be retrieved, as there is no JavaScript engine running (which there is in a browser).
In the code below, the table is grabbed by its id, cr1.
In the helper sub, WriteTable, we loop the column headers (th tags) and then the table rows (tr tags), and finally traverse the length of each table row, table cell by table cell. As we only want data from two of the columns (the code picks cells 2 and 8 of each row), a Select Case statement is used to specify what is written out to the sheet.
Sample webpage view:
Sample code output:
VBA:
Option Explicit

Public Sub GetRates()
    Dim html As HTMLDocument, hTable As HTMLTable '<== Tools > References > Microsoft HTML Object Library
    Set html = New HTMLDocument
    With CreateObject("MSXML2.XMLHTTP")
        .Open "GET", "https://uk.investing.com/rates-bonds/financial-futures", False
        .setRequestHeader "If-Modified-Since", "Sat, 1 Jan 2000 00:00:00 GMT" 'to deal with potential caching
        .send
        html.body.innerHTML = .responseText
    End With
    Application.ScreenUpdating = False
    Set hTable = html.getElementById("cr1")
    WriteTable hTable, 1, ThisWorkbook.Worksheets("Sheet1")
    Application.ScreenUpdating = True
End Sub

Public Sub WriteTable(ByVal hTable As HTMLTable, Optional ByVal startRow As Long = 1, Optional ByVal ws As Worksheet)
    Dim tSection As Object, tRow As Object, tCell As Object, tr As Object, td As Object, r As Long, C As Long, tBody As Object
    r = startRow: If ws Is Nothing Then Set ws = ActiveSheet
    With ws
        Dim headers As Object, header As Object, columnCounter As Long
        Set headers = hTable.getElementsByTagName("th")
        For Each header In headers
            columnCounter = columnCounter + 1
            Select Case columnCounter
            Case 2
                .Cells(startRow, 1) = header.innerText
            Case 8
                .Cells(startRow, 2) = header.innerText
            End Select
        Next header
        startRow = startRow + 1
        Set tBody = hTable.getElementsByTagName("tbody")
        For Each tSection In tBody
            Set tRow = tSection.getElementsByTagName("tr")
            For Each tr In tRow
                r = r + 1
                Set tCell = tr.getElementsByTagName("td")
                C = 1
                For Each td In tCell
                    Select Case C
                    Case 2
                        .Cells(r, 1).Value = td.innerText
                    Case 8
                        .Cells(r, 2).Value = td.innerText
                    End Select
                    C = C + 1
                Next td
            Next tr
        Next tSection
    End With
End Sub

-
I personally also prefer the XMLHTTP request, but a couple of things must be added for this method: 1) the request picks the values from the cache, so for financial values it is necessary to force a refresh of the request every time (sometimes this is achievable by passing a random string along with the URL, but that doesn't always work). 2) With any page contained in a frame, it's impossible to use the XMLHTTP request. And, in my experience, most of these data (not the ones in the above example) are usually embedded in a frame, as they are provided by external providers. – Matteo NNZ Sep 13 '18 at 11:51
-
But a +1 is still well deserved for this answer, which for this specific use case (after adding a random string to the URL to avoid picking up from cache) is much cleaner than simply putting a robot behind the browser. – Matteo NNZ Sep 13 '18 at 11:53
-
@MatteoNNZ That's very kind. Can you tell me more about the caching and using random string added to url please? Would you add this to the end of each URL? How would that work? With caching problems before I have used .setRequestHeader "If-Modified-Since", "Sat, 1 Jan 2000 00:00:00 GMT" – QHarr Sep 13 '18 at 12:05
-
For a frame it depends - I would normally use selenium for dealing with frames but it may be possible to parse a new url from the frame and use that in a second http request . Sometimes this is an option. – QHarr Sep 13 '18 at 12:07
-
If happy to discuss I would then like to include the points discussed as an edit to the answer if you are happy with that? Looks like some pointers [here](https://stackoverflow.com/questions/3055475/vbscript-disable-caching-of-response-from-server-to-http-get-url-request) – QHarr Sep 13 '18 at 12:08
-
I generally have never had this caching problem but have had a user, using a script of mine, have this problem and adding the SetRequestHeader resolved this. I believe the cache problem sits with the browser rather than the object itself but some of the debate on that link above points towards using WinHTTP as a safer method. Thoughts @MatteoNNZ? – QHarr Sep 13 '18 at 12:15
-
For the caching issue, it happened several times for me (I work in financial fields and need to retrieve real-time values every few seconds). What I usually do is just I append a random number to the url string (no delimiter), and the back-end of those sites is usually able to isolate the request and take out the crap (if instead the request is still the same, then it routes directly to the cache without actually even reaching the back-end of the provider). – Matteo NNZ Sep 13 '18 at 13:00
-
@MatteoNNZ Thanks. Did you try either WinHTTP or .setRequestHeader "If-Modified-Since", "Sat, 1 Jan 2000 00:00:00 GMT" ? – QHarr Sep 13 '18 at 13:02
-
Regarding the frame, I've tried a couple of times to do what you say... but not always succeeded (often, especially with newer websites as Angular, the HTML content is almost entirely rendered by the TypeScript/JavaScript and the GET request XMLHTTP can't read it as it doesn't actually execute the scripting behind). – Matteo NNZ Sep 13 '18 at 13:02
-
Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/179995/discussion-between-qharr-and-matteo-nnz). – QHarr Sep 13 '18 at 13:03
-
I've been scraping with XHR since 2005, but sometime after upgrading to Windows 10 all my XHR-based Excel sheets stopped working, and I haven't been able to figure out why. Has anyone experienced this? – thdoan May 13 '19 at 03:37
-
@thdoan I will get back to you later today. If you haven't heard from me by 14/5/19 then ping me here or in [dawghaus](https://chat.stackoverflow.com/rooms/169987/dawgs-waffle-haus) – QHarr May 13 '19 at 06:17
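The cache-busting idea discussed in the comments above (appending a random value to the URL so that each request looks new) could be sketched as follows. This is a variation on what the commenter describes, not his exact method: he appends the number with no delimiter, whereas this sketch uses a throwaway query parameter. The function name CacheBustedUrl and the parameter name "nocache" are arbitrary assumptions, and whether a given site tolerates the extra parameter has to be checked case by case.

```vba
' Hypothetical helper: returns the URL with a throwaway query parameter appended,
' so that intermediate caches treat every request as a new one.
Public Function CacheBustedUrl(ByVal url As String) As String
    Dim separator As String
    ' Use "&" if the URL already has a query string, otherwise start one with "?"
    separator = IIf(InStr(url, "?") > 0, "&", "?")
    Randomize
    CacheBustedUrl = url & separator & "nocache=" & _
                     Format$(Now, "yyyymmddhhnnss") & CLng(Rnd() * 1000000)
End Function
```

In the GetRates code it would be used as `.Open "GET", CacheBustedUrl("https://uk.investing.com/rates-bonds/financial-futures"), False`, either alongside or instead of the If-Modified-Since header.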
You can use the WinHttpRequest object instead of Internet Explorer. It retrieves only the page's HTML, without downloading the pictures and advertisements that make the Internet Explorer object heavy compared to the WinHttpRequest object.
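As a rough illustration of this suggestion, a late-bound WinHttpRequest call might look like the sketch below. The Sub name, the User-Agent value and the status check are assumptions added for illustration; the table id "cr1" is taken from the XMLHTTP answer above and may no longer match the live page.

```vba
' Sketch: fetch the page with WinHttpRequest (no browser window) and parse it.
Public Sub GetPageWithWinHttp()
    Dim http As Object
    Dim html As Object
    Dim hTable As Object

    Set http = CreateObject("WinHttp.WinHttpRequest.5.1")
    http.Open "GET", "https://uk.investing.com/rates-bonds/financial-futures", False
    ' Some sites reject requests without a browser-like User-Agent (assumed value)
    http.setRequestHeader "User-Agent", "Mozilla/5.0"
    http.send

    If http.Status <> 200 Then
        MsgBox "Request failed: " & http.Status & " " & http.statusText
        Exit Sub
    End If

    ' Late-bound HTML document, so no library reference is required
    Set html = CreateObject("htmlfile")
    html.body.innerHTML = http.responseText

    Set hTable = html.getElementById("cr1")
    If hTable Is Nothing Then
        MsgBox "Table cr1 was not found in the response."
    Else
        Debug.Print "Rows found: " & hTable.Rows.Length
    End If
End Sub
```

As with the XMLHTTP approach, content that the page adds with JavaScript after loading will not be present in the response.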

This question was asked a long time ago, but I thought the following information would be useful for newbies. You can easily get the values by class name, like this.
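Sub ExtractLastValue()
    Dim objIE As Object
    Set objIE = CreateObject("InternetExplorer.Application")
    objIE.Top = 0
    objIE.Left = 0
    objIE.Width = 800
    objIE.Height = 600
    objIE.Visible = True
    objIE.Navigate "https://uk.investing.com/rates-bonds/financial-futures/"
    ' Wait until the page has finished loading (4 = READYSTATE_COMPLETE)
    Do
        DoEvents
    Loop Until objIE.readyState = 4
    ' Show the text of the first element with this class name
    MsgBox objIE.document.getElementsByClassName("pid-8907-last")(0).innerText
    ' Close the browser and release the object
    objIE.Quit
    Set objIE = Nothing
End Sub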
And if you are new to web scraping, please read this blog post.
There are also various other techniques for extracting data from web pages. This article explains a few of them with examples.

I modified a few things that were raising errors for me and ended up with this, which worked great to extract the data I needed:
Sub get_data_web()
    Dim appIE As Object
    Dim allRowofData As Object, itm As Object
    Dim i As Long, rowCount As Long
    Dim myValue As String

    Set appIE = CreateObject("internetexplorer.application")
    With appIE
        .navigate "https://finance.yahoo.com/quote/NQ%3DF/futures?p=NQ%3DF"
        .Visible = True
    End With
    ' Wait until the page has finished loading (4 = READYSTATE_COMPLETE)
    Do While appIE.Busy Or appIE.readyState <> 4
        DoEvents
    Loop
    ' Each matching element is a table row; copy its first five cells to the sheet
    Set allRowofData = appIE.document.getElementsByClassName("Ta(end) BdT Bdc($c-fuji-grey-c) H(36px)")
    rowCount = 1
    For Each itm In allRowofData
        For i = 0 To 4
            myValue = itm.Cells(i).innerText
            ActiveSheet.Cells(rowCount, i + 1).Value = myValue
        Next i
        rowCount = rowCount + 1
    Next itm
    appIE.Quit
    Set appIE = Nothing
End Sub