0

I gotta import multiple huge xml data files into Excel. I cannot use the simple loadXML() function since Excel doesn't have enough RAM available. (some of the xml files are ~100mb)

Now I've really tried out a lot... But couldn't really make it happen at all. Example XML File:

<OMDS xmlns="urn:omds20" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="urn:omds20 ./omds24-00.xsd">
   <PAKET VUNr="1" MaklerID="2" PaketZpktErstell="x" PaketZpktLetztErstell="y">
      <PROVISION ProvisionsID="123" Polizzennr="321" Vermnr="5" BuchDat="2013-02-27"/>
      <PROVISION ProvisionsID="456" Polizzennr="321" Vermnr="5" BuchDat="2013-02-27"/>
      <PROVISION ProvisionsID="789" Polizzennr="321" Vermnr="5" BuchDat="2013-02-27"/>
   </PAKET>
</OMDS>

So what I have in VBA is something like that:

Sub ParseXmlDocument()
 Dim doc As New MSXML2.DOMDocument
 Dim success As Boolean

 success = doc.Load(App.Path & "\test.xml")
 If success = False Then
    MsgBox doc.parseError.reason
 Else
   Dim nodeList As MSXML2.IXMLDOMNodeList

   Set nodeList = doc.selectNodes("/OMDS/PAKET/PROVISION")

   If Not nodeList Is Nothing Then
     Dim node As MSXML2.IXMLDOMNode
     Dim idAs String
     Dim value As String

     For Each node In nodeList
        id= node.selectSingleNode("ProvisionsID").Text
     Next node
   End If
  End If
End Sub

After that I'm just trying to Print the ID within a MsgBox, but since the nodeList always appears to be empty, I can't make it happen.

Hope someone can help me out.


Thanks to GSerg i was able to solve the problem. Here the Solution

Sub ParseXmlDocument()
    Dim doc As New MSXML2.DOMDocument
    Dim success As Boolean

  With doc
    .async = False
    .setProperty "SelectionLanguage", "XPath"
    .setProperty "SelectionNamespaces", "xmlns:t='urn:omds20'"
  End With

  success = doc.Load("C:\...\demo.xml")

 If success = False Then
    MsgBox doc.parseError.reason
 Else

 Dim nodeList As MSXML2.IXMLDOMNodeList
 Set nodeList = doc.SelectNodes("/t:OMDS/t:PAKET/t:PROVISION")

   If Not nodeList Is Nothing Then
     Dim node As MSXML2.IXMLDOMNode
     Dim id As String
     Dim value As String

   For Each node In nodeList


    id = node.SelectSingleNode("@ProvisionsID").Text
   Next node
  End If
 End If
End Sub
Community
  • 1
  • 1
cosi
  • 3
  • 4
  • Your nodes belong to a namespace, but you don't provide any namespace in your xpath query. Also `Load` is async unless `.async = False` is used, so you are checking for success before loading completes. – GSerg Sep 25 '14 at 09:04
  • How can i provide the namespace for the xpath query? Working with XML for the first time sry :-/ Thanks for the tip with async :) – cosi Sep 25 '14 at 09:10
  • http://stackoverflow.com/q/2141181/11683 – GSerg Sep 25 '14 at 09:33
  • Sorry, but I really don't get it... – cosi Sep 25 '14 at 10:12
  • Read [my answer](http://stackoverflow.com/a/2142551/11683) to learn how to set required property on your `doc`, read [Robert Rossney's answer](http://stackoverflow.com/a/2145276/11683) to learn why. – GSerg Sep 25 '14 at 10:26

1 Answers1

1

Your source XML contains namespaces, but your xPath query does not. So the xPath will be looking for nodes with empty namespace, and you don't have any.

In order to fix it you need to provide a namespace in your xPath query. Ways to do that differ based on the XML library used. For MSXML, you need to set the SelectionNamespaces property on the DOMDocument object to include your namespace with a prefix:

doc.setProperty("SelectionNamespaces", "xmlns:t='urn:omds20'")

And then change your query to use that prefix:

Set nodeList = doc.selectNodes("/t:OMDS/t:PAKET/t:PROVISION")
Community
  • 1
  • 1
GSerg
  • 76,472
  • 17
  • 159
  • 346