0

I have 3000 ONIX-style xml URLs with book metadata. I would like to filter a few nodes out of these files and parse them into Excel rows in a single worksheet.

This is an example of a XML-path: http://btsoep.appspot.com/rest/book/9789082516425

I would like to filter this XML data on things as

<Product>
 <Title>
  <TitleText>

and

<Product>
 <Contributor>
  <PersonName>

I need the retrieved Data in an Excel worksheet where every URL has it's own row. So the URLs are in row A, TitleText is in row B and PersonName is in row C.

How would I go by doing this?

EDIT 1:

What I've tried so far is: -using wget to download all the xml data first and then try to bulk parse this is excel. This could work but is unnecessary. -using the default XMLimport function in Excel. I can't seem to run this in a batch.

I don't have any experience with filtering XML files and parsing this in an Excel worksheet. I'm not asking anyone to fix this for me or write code for me, but I would like a step in the good direction. Which tools would be the best to use for this case? Thanks again.

  • start by doing internet search for `vba parse xml` .... you should have done this before posting a question here – jsotola Sep 03 '17 at 16:11
  • Easy as cake...simply loop through list of URLs and parse xpaths to those node values iteratively writing to cells by row and column. OPs have all the fun! – Parfait Sep 03 '17 at 20:48
  • 1
    Now that we know what you need, please show us what you have tried so far. Because (a) this is no free code writing service (b) questions asking for how to implement something without relating to a specific code issue are off-topic. – Pᴇʜ Sep 04 '17 at 06:40
  • @jsotola Thanks for your answer. I've been trying all kinds of stuff for the last 3 days, but I can't seem to get further or find help on existing threads. A step in the right direction would be great. – Christian Kasius Sep 04 '17 at 09:06
  • @Parfait Thanks! And which tools would you use to loop and parse those XML nodes? Is this all possible in VBA? – Christian Kasius Sep 04 '17 at 09:09
  • @Peh I'm new here, I''m not used to the stackoverflow etiquette yet. Thanks for your feedback! I'll edit the original post with additional information on methods I've tried so far. Thanks! – Christian Kasius Sep 04 '17 at 09:11
  • 1
    Possible duplicate of [How to parse XML using vba](https://stackoverflow.com/questions/11305/how-to-parse-xml-using-vba) There are thousands of tutorials how to parse XML with VBA. Go through them come back if you have a strategy and your first code. Then ask a specific question to your code (where you got stuck/errors etc). Questions on how to implement something are too broad to answer and therefore off-topic. – Pᴇʜ Sep 04 '17 at 10:14
  • this has code that reads xml from web http://itpscan.info/blog/excel/VBA-yahoo-finance-web-services-1.php – jsotola Sep 06 '17 at 01:10
  • video here https://www.youtube.com/watch?v=6H7tBL97orE – jsotola Sep 06 '17 at 01:12
  • Thanks for the help guys, I really appreciate it. I'll be back with a progress update. Thanks! – Christian Kasius Sep 06 '17 at 12:44

1 Answers1

0

this should get you started

Option Explicit

Sub parseONIX()

    Dim URL As String

    URL = "http://btsoep.appspot.com/rest/book/9789082516425"
 '  URL = "https://www.w3schools.com/xml/plant_catalog.xml"

    Dim XMLPage As New MSXML2.XMLHTTP60
    XMLPage.Open "GET", URL, False
    XMLPage.send

    Dim XMLDoc As New MSXML2.DOMDocument
    XMLDoc.LoadXML XMLPage.responseText

    Debug.Print XMLDoc.ChildNodes(0).BaseName
    Debug.Print XMLDoc.ChildNodes(1).BaseName
    Debug.Print XMLDoc.ChildNodes(1).ChildNodes(0).BaseName
    Debug.Print XMLDoc.ChildNodes(1).ChildNodes(1).BaseName
    Debug.Print XMLDoc.getElementsByTagName("Product").Item(0).BaseName

    Dim i As Integer

    For i = 0 To XMLDoc.getElementsByTagName("Measure").Length - 1

        Debug.Print "type: "; XMLDoc.getElementsByTagName("Measure")(i).ChildNodes(0).Text,
        Debug.Print XMLDoc.getElementsByTagName("Measure")(i).ChildNodes(1).Text,
        Debug.Print XMLDoc.getElementsByTagName("Measure")(i).ChildNodes(2).Text

    Next i



End Sub
jsotola
  • 2,238
  • 1
  • 10
  • 22