0

So I finally have to ask this from an expert after I found totally stuck in it. Here is the Macro Written by me. The main purpose is to fetch the Product Price and Product Name from Amazon Website and paste it in Excel column. All i need to add ASIN number of Product in column A1 manually. It will automatically go to the given link and collect data and paste it in respective columns i.e., column B1 and C1.

This is the example URL. I am trying to get the data from this URL for now.

https://www.amazon.co.uk/Alessi-Colombina-Soup-Plate-FM10/dp/B0012620YA/ref=sr_1_1?dchild=1&keywords=B0012620YA&qid=1603405464&sr=8-1

I tried to find the class of Price and Name in InspectElement, but I didn't find any class or ID working there. When I run macro it goes to site and loads for sometime but it doesn't fetch Product Price or Name from there and its totally unpredictable why it's not working.

I tried using getElementsbyClassName("") and getElementsbyID(""). All not working.

Is there any error ( logical ) in my code or I need to rewrite it? Actually I am building a macro. Any help from Experts is very appreciable. I have Microsoft HTML Object Library & Internet Controls Checked.

Here is the code i am using:

Sub useClassnames()

Dim element As IHTMLElement
Dim elements As IHTMLElementCollection
Dim IE As InternetExplorer

Dim html As HTMLDocument

Set IE = New InternetExplorer

IE.Visible = True

IE.navigate "https://www.amazon.co.uk/Alessi-Colombina-Soup-Plate-FM10/dp/B0012620YA/ref=sr_1_1?dchild=1&keywords=B0012620YA&qid=1603405464&sr=8-1"

'Wait until IE has loaded the web page

Do While IE.readyState <> READYSTATE_COMPLETE

Application.StatusBar = " Loading Web page … "

DoEvents

Loop

Set html = IE.document

Set elements = html.getElementsByClassName(" priceblock_ourprice ")

'Here is the className "priceblock_ourpirce" I found it in InspectElement but its not getting any data. I am not able to find the 'correct ClassName for Price and Name of Product from InspectElemnt. As I am not a web developer.

Dim count As Long

Dim erow As Long

count = 0

For Each element In elements

If element.className = “priceblock_ourprice” Then

erow = Sheet1.Cells(Rows.count, 1).End(xlUp).Offset(1, 0).Row

Cells(erow, 1) = html.getElementsById(“priceblock_ourprice”)(count).innerValue

Cells(erow, 2) = html.getElementsById(" productTitle ")(count).innerText

count = count + 1

End If

Next element

Range("A2:B2").Select

Columns("A:A ").EntireColumn.AutoFit ""

Columns("B:B").ColumnWidth = 36

MsgBox count

End Sub
Tim Williams
  • 154,628
  • 8
  • 97
  • 125

2 Answers2

1

GetElement s ByID

  • I wrote the function getElementsByID the other day. The procedure testGEBI shows how to use it.
  • My knowledge of this subject is very limited, so be aware that the function may produce unwanted results. In my usage so far, it failed with an error message if there was no internet connection, or if an ID did not exist.
  • The current setup will write Alessi Colombina Soup Plate, Set of 6 (FM10/2) to A2 and £129.00 to B2 and will then autofit columns A and B.

The Code

Option Explicit

Sub testGEBI()
    
    Const First As String = "A2"
    Const URL  As String = "https://www.amazon.co.uk/Alessi-Colombina-Soup-" _
                         & "Plate-FM10/dp/B0012620YA/ref=sr_1_1?dchild=1&" _
                         & "keywords=B0012620YA&qid=1603405464&sr=8-1"
    Dim Elements As Variant
    Elements = Array("productTitle", "priceblock_ourprice")
    
    Dim Data As Variant
    Data = getElementsByID(Elements, URL)
    
    With Range(First)
        .Resize(, UBound(Data, 2)).Value = Data
        Columns(.Column).Resize(, UBound(Data, 2)).AutoFit
    End With

End Sub

Function getElementsByID(Elements As Variant, _
                         ByVal URL As String, _
                         Optional ByVal getColumn As Boolean = False) _
         As Variant
    
    Dim rText As String
    With CreateObject("MSXML2.XMLHTTP")
        .Open "GET", URL, False
        .setRequestHeader "Content-Type", "text/xml"
        .Send
        rText = .responseText
    End With
        
    Dim j As Long
    With CreateObject("htmlfile")
        .body.innerHTML = rText
        Dim Offs As Long
        Offs = LBound(Elements) - 1
        Dim ElementsCount As Long
        ElementsCount = UBound(Elements) - Offs
        Dim Data As Variant
        If Not getColumn Then
            ReDim Data(1 To 1, 1 To ElementsCount)
            For j = 1 To ElementsCount
                Data(1, j) = WorksheetFunction _
                  .Trim(.getElementById(Elements(j + Offs)).innerText)
            Next j
        Else
            ReDim Data(1 To ElementsCount, 1 To 1)
            For j = 1 To ElementsCount
                Data(j, 1) = WorksheetFunction _
                  .Trim(.getElementById(Elements(j + Offs)).innerText)
            Next j
        End If
    End With
    
    getElementsByID = Data
    
End Function
VBasic2008
  • 44,888
  • 5
  • 17
  • 28
  • Thanks! for the code. Really appreciable. But its not working for me. There is an error message of " Object variable or with block Variable not set". – Salman.kodeforest Oct 23 '20 at 07:58
  • The error occurs if the ID in the `Elements` array does not exist on the page or if the `URL` is wrong. Can you share the IDs and the URL you're currently using? BTW this code works without any references. – VBasic2008 Oct 23 '20 at 08:21
  • Thankyou very Much for your Help! Without any reference? I think the reference you are talking about is the URL from where it has to fetch data. Here is the URL from where i want to get the price and name of Product. Awaiting for your kind reply. https://www.amazon.co.uk/Alessi-Colombina-Soup-Plate-FM10/dp/B0012620YA/ref=sr_1_1?dchild=1&keywords=B0012620YA&qid=1603405464&sr=8-1 – Salman.kodeforest Oct 23 '20 at 09:55
  • The references I talk about are e.g. *Microsoft HTML Object Library & Internet Controls*. You don't need them with this code. Open a new workbook. In VBE add a new module and paste the complete code into it. Now if you run the `testGEBI` procedure, then in the `ActiveSheet` in cells `A2` and `B2`, the code will display the data requested. You can even use this shortened URL: `https://www.amazon.co.uk/Alessi-Colombina-Soup-Plate-FM10/dp/B0012620YA`. It works for me, I've done it 10 times: `Alessi Colombina Soup Plate, Set of 6 (FM10/2)` and `£129.00` are the results. – VBasic2008 Oct 23 '20 at 10:29
  • @Zwenn: Thanks for the confirmation that it works. The code is flexible i.e. you can add more IDs and they will be written in the same row cell after cell. But it only works for IDs. – VBasic2008 Oct 23 '20 at 10:36
  • Yes, I understand the code. In my answer I used a code that is "easier to read" for beginners. The way via xhr is better than via IE. But harder to understand if you don't have much experience with web scraping. – Zwenn Oct 23 '20 at 10:40
  • @Zwenn: Sorry, I didn't refresh the page. I would ask you about my function. Do those objects have to be somehow unloaded or set to `Nothing` similarly to e.g. `IE.Quit` or is that all. And maybe if you could share a link where I could learn more about `xhr`. I'm a total beginner when it comes to web scraping. – VBasic2008 Oct 23 '20 at 10:51
  • 1
    This explains why you do not catch runtime errors if the ID does not exist. I didn't understand that at the beginning too. But it is quite simple. `getElementByID()` does not create a NodeCollection, but a single object. If the object cannot be created, VBA throws a runtime error. `getElementsByTagName()` and `getElementsByClassName()` create NodeCollections, in which all found HTML elements are stored. If no elements are found, there is no runtime error, because the NodeCollection is generated in any case. But if in doubt, the Length property is 0. – Zwenn Oct 23 '20 at 11:28
  • See here how you can catch runtime errors for IDs: https://stackoverflow.com/questions/63294113/automate-ie-via-excel-to-fill-in-a-dropdown-and-continue/63299608#63299608 Unfortunately I have no link for bundled information on xhr. I am still working my way into this topic too. Much of what I know about web scraping I learned on SO. Especially from QHarr and a whole new way via xhr from SIM: https://stackoverflow.com/questions/64017476/scraping-specific-data-inside-a-table-ii/64024088#64024088 – Zwenn Oct 23 '20 at 11:28
  • @Zwenn: Those links are very useful. Among other stuff, now I can implement error handling into my function. And there are links to other web pages. Thanks a lot. Special thanks for the ASIN info. That opens a whole new world of possibilities. – VBasic2008 Oct 23 '20 at 11:51
  • 1
    I am happy if the links will take you further :-) I forgot to say something about `Set Nothing`. So `ie.Quit` is a must, otherwise the IE stays in memory because it is a third party application. But VBA internal object variables don't need to be reset. They are destroyed by the garbage collector when a function or sub is terminated. I only do this out of old habit. My first programming language was C. There wrong memory management is punished immediately. – Zwenn Oct 23 '20 at 12:16
  • Hi Zwenn, I tried to run the code by changing the URL. And it gives me the error message " Object variable or With Block not set" I am using the same code you mentioned. It was working before it. But Now when i made some URL changings, its not working. Is there something wrong with my sheet. As the code was running well at first. Then i tried another URL and NOW I am using the exactly same URL as I was you mentioned in your code, but its giving Error too. What's Wrong with this? Any idea? – Salman.kodeforest Oct 23 '20 at 16:58
1

I think the problem is not to scrape the values you want. I Think you have to learn another way to get information how you can scrape values from webpages.

You try to get the information about class names and IDs by inspecting the elements. A better way is, to press F12 in the browser (but don't use the IE for that). I use FireFox. After pressing F12 different tools open at the bottom of the page. The first tab of this area is "Inspector".

You can see the HTML code. But it is organized with arrows you can click to enlarge HTML code for the different elements. When you place the mouse over the different lines of HTML you can see in the upper area a blue overlay of the corresponding element. Now you can click through the arrow levels until you reach the element you search for.

Look there for an ID or class name. If there is nothig you can use to scrape the element it is helpfull to look the HTML levels obove to fence the HTML you are working with. For your little project, you can use IDs to scrape title and price.

Before I show you the VBA code to do that, two more things:

  1. You wrote, you only need the ASIN. That is right but you don't do that in your code. You use a url with the title you wana scrape. But I have good news for you. You can use realy only the ASIN and nothing else: https://www.amazon.co.uk/dp/B0012620YA shows the same page like https://www.amazon.co.uk/Alessi-Colombina-Soup-Plate-FM10/dp/B0012620YA/ref=sr_1_1?dchild=1&keywords=B0012620YA&qid=1603405464&sr=8-1

  2. On Amazon often different sellers with different prices use the same offer. Look in your linked offer for the line: New (3) from £129.00 + FREE Shipping If you click the link a page with the seller overview will open. If you need all the prices and sellernames, you have to do a lot more work.

Here is the VBA code for scraping the title and the price which is shown in the offer:

Sub ScrapeAmazonOffers()

Dim url As String
Dim ie As Object
Dim nodeTitle As Object
Dim nodePrice As Object
Dim resultExample As String

  'Amazon offer url, only with ASIN
  url = "https://www.amazon.co.uk/dp/B0012620YA"

  'Initialize Internet Explorer, set visibility,
  'call URL and wait until page is fully loaded
  Set ie = CreateObject("InternetExplorer.Application")
  ie.Visible = True
  ie.navigate url
  Do Until ie.readyState = 4: DoEvents: Loop
  
  'Get offer title
  Set nodeTitle = ie.document.getElementByID("productTitle")
  resultExample = nodeTitle.innertext
  
  'Get price
  Set nodePrice = ie.document.getElementByID("priceblock_ourprice")
  resultExample = resultExample & Chr(13) & nodePrice.innertext
  
  'Clean up
  ie.Quit
  Set ie = Nothing
  Set nodeTitle = Nothing
  Set nodePrice = Nothing
  
  'Show result for this example
  MsgBox resultExample
End Sub
Zwenn
  • 2,147
  • 2
  • 8
  • 14
  • Thanks for this macro. Running this macro code will take the user to website. Is that okay? And one thing I want to mention. As I am trying to to buid a macro to directly get data Price and Name of Product from "Amazon.co.uk" website. Here in the code I am giving the dummy URL to fetch data. I am trying to build a macro to perform search automatically and paste data in Column B1 and C1 after I put ASIN in code in column A1. Is there any method or proxy to do that? Or its not possible with current situation? As I am stuck in it from 5 days. Thanks for you help and sharing knowledge. – Salman.kodeforest Oct 23 '20 at 17:17