0

I'm trying to extract values stored inside different buttons on the webpage. It seems button of each variant has no name, they are just called "variant__box", which are under "variants" div class.
As far as I can tell, values are loaded by javascript on each variant__box.

This is the website to get the data: https://www.honda.co.uk/motorcycles/range/adventure/crf1100l-africa-twin-adventure-sports/specifications-and-price.html#/

This is the code I've written so far

Dim ie As Object
Dim html As New HTMLDocument
Dim address, str As String
Dim jobDetailsList As Object
Dim jobitem As Object

Set ie = CreateObject("InternetExplorer.Application")

ie.navigate address 'the one mentioned above
ie.Visible = False

While ie.Busy Or ie.readyState < 4
    DoEvents
Wend

Set html = ie.document
Set jobDetailsList = html.getElementsByClassName("variants")
                  
For Each jobitem In jobDetailsList
    jobitem.Click
    str = jobitem.innerText
    ActiveSheet.Cells(i, 5).Value = str
    i = i + 1
Next jobitem
    
Set html = Nothing
ie.Quit
Set ie = Nothing

It returns nothing.

Community
  • 1
  • 1
MarcoG
  • 5
  • 3
  • the buttons you refer to are `span` elements are they not and they cannot have a name attribute – Professor Abronsius Jan 26 '21 at 09:19
  • as far as i can see, these buttons are inside a div class. values of each variant are stored within a div class "variant__wrapper". When i try to access it using getelementsbyclass, i get nothing – MarcoG Jan 26 '21 at 09:28
  • it's years since I played with VBA - can't even remember how to get the references set to the HTMLDocument object – Professor Abronsius Jan 26 '21 at 09:36
  • looking at the page source directly there is no element directly labelled with class "variants" so it is added by javascript which means your web scrape will not work – Professor Abronsius Jan 26 '21 at 10:02
  • Try this [link](https://motorcycle-api.honda.eu/api/products/v1/en-GB/bikes/crf1100l_atas?status=conLive) to get json response using xmlhttp requests and then use regex or any third party json converter to scrape the content you are after. – SIM Jan 26 '21 at 10:10
  • Thank you @SIM! By accessing the link you provided i can at least see the values i was looking for. The problem is i have 0 experience with regex. Do you think that using the split function with ":" as delimiter will achieve somehow the same result? – MarcoG Jan 26 '21 at 13:21
  • Yeah, sure. You can check out [one](https://stackoverflow.com/questions/6627652/parsing-json-in-excel-vba/44596487#44596487) of my answers which is very related to this. Thanks. – SIM Jan 26 '21 at 13:24

1 Answers1

0

If you want to use the IE you can use the following code. But SIM's suggestion is better because IE is then omitted.

Sub ScrapeMotorCycleData()
  Dim ie As Object
  Dim address, str As String
  Dim jobDetailsList As Object
  Dim jobitem As Object
  Dim i As Long
  
  i = 2
  address = "https://www.honda.co.uk/motorcycles/range/adventure/crf1100l-africa-twin-adventure-sports/specifications-and-price.html#/"
  Set ie = CreateObject("InternetExplorer.Application")
  ie.navigate address 'the one mentioned above
  ie.Visible = False
  'The following line doesn't do what you want
  'While ie.Busy Or ie.readyState < 4: DoEvents: Wend
  
  'You nee a loop here to wait for loading the dynamic content
  'Ask for the HTML part you want to scrape
  '(No timeout included here, but it can be programmed)
  Do
    Set jobDetailsList = ie.document.getElementsByClassName("variant__wrapper")
  Loop Until jobDetailsList.Length > 0
  
  For Each jobitem In jobDetailsList
    ActiveSheet.Cells(i, 5).Value = jobitem.innerText
    i = i + 1
  Next jobitem
  
  ie.Quit
  Set ie = Nothing
End Sub
Zwenn
  • 2,147
  • 2
  • 8
  • 14