3

I have an XML file that I am trying to parse, whose contents are exactly the XML below:

<Results xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <Reference>{REFERENCE-HERE}</Reference>
  <FillerTags>Filler</FillerTags>
  <entity>
    <entityName>ABC</entityName>
    <entityId>012345</entityId>
  </entity>
  <Items>
     <Item>
      <FillerTagsAgain>Filler2</FillerTagsAgain>
      <FillerTagsAgain>Filler2</FillerTagsAgain>
      <FillerTagsAgain>Filler2</FillerTagsAgain> 
     </Item>
     <AnotherItem> 
       <FillerTagsAgain>Filler2</FillerTagsAgain>
       <FillerTagsAgain>Filler2</FillerTagsAgain>
       <FillerTagsAgain>Filler2</FillerTagsAgain> 
     </AnotherItem>
   </Items>
</Results>

I have been trying to get the code below (initially from my question here) to work. Several other users (including the code's creator) have been able to use it successfully, but when I run it the output file simply comes out as ÿþ<. I made sure to encode the file as ANSI and immediately save the Notepad file as .xml but the output still only has the byte order mark.

Option Explicit
Public Declare Sub Sleep Lib "kernel32" (ByVal dwMilliseconds As Long)
Sub ParseResults()
'Requires reference to Microsoft XML, v6.0
'Requires referenc to Microsoft Scripting Runtime
Dim xmlFilePath$, newFilePath$
Dim DOM As MSXML2.DOMDocument
Dim entity As IXMLDOMNode
Dim fso As Scripting.FileSystemObject

'# Define the file you are going to load as XML
xmlFilePath = "PATH"

'# Define an output path for where to put the modified XML
newFilePath = "NEWPATH"

'# Create our DOM object
Set DOM = CreateObject("MSXML2.DOMDocument")

'# Load the XML file
DOM.Load xmlFilePath

'# Wait until the Document has loaded
Do
    Sleep 250
Loop Until DOM.readyState = 4

'# Get the entityID node
Set entity = DOM.DocumentElement.getElementsByTagName("entityId")(0)

'# Call a subroutine to append the entity to "Item" tags
AppendEntity DOM, "Item", entity
'# Call a subroutine to append the entity to "AnotherItem" tags
AppendEntity DOM, "AnotherItem", entity

'## Create an FSO to write the new file
Set fso = CreateObject("Scripting.FileSystemObject")

'## Attempt to write the new/modified XML to file
On Error Resume Next
fso.CreateTextFile(newFilePath, True, True).Write DOM.XML
If Err Then
    '## Print the new XML in the Immediate window
    Debug.Print DOM.XML
    MsgBox "Unable to write to " & newFilePath & " please review XML in the Immediate window in VBE.", vbInformation
    Err.Clear
End If
On Error GoTo 0

'Cleanup
Set DOM = Nothing
Set fso = Nothing
Set entity = Nothing

End Sub

Sub AppendEntity(DOM As Object, tagName As String, copyNode As Object)
'## This subroutine will append child node to ALL XML Nodes matching specific string tag.
Dim itemColl As IXMLDOMNodeList
Dim itm As IXMLDOMNode

'# Get a collection of all elements matching the tagName
Set itemColl = DOM.DocumentElement.getElementsByTagName(tagName)

'# Iterate over the collection, appending the copied node
For Each itm In itemColl
    If itm.HasChildNodes Then
        '# Insert this node before the first child node of Item
        itm.InsertBefore copyNode.CloneNode(True), itm.FirstChild
    Else
        '# Append this node to the Item
        itm.appendChild copyNode.CloneNode(True)
    End If
Next

Set itm = Nothing
Set itemColl = Nothing

End Sub

To be sure, the code produces no errors - it creates a new file, but the file it creates is incorrect. The correct output should be (and is for some others who have tried this code) this:

<Results xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <Reference>{REFERENCE-HERE}</Reference>
  <FillerTags>Filler</FillerTags>
  <entity>
    <entityName>ABC</entityName>
    <entityId>012345</entityId>
  </entity>
  <Items>
    <Item>
      <entityId>012345</entityId>
      <FillerTagsAgain>Filler2</FillerTagsAgain>
      <FillerTagsAgain>Filler2</FillerTagsAgain>
      <FillerTagsAgain>Filler2</FillerTagsAgain> 
     </Item>
     <AnotherItem> 
       <entityId>012345</entityId>
       <FillerTagsAgain>Filler2</FillerTagsAgain>
       <FillerTagsAgain>Filler2</FillerTagsAgain>
       <FillerTagsAgain>Filler2</FillerTagsAgain> 
     </AnotherItem>
   </Items>

That is, the code inserts the as a child node of each tag. In the real XML document that I am looking to apply this code to after I can get it to work on this example XML is much the same, but contains multiple entities. For example:

   <Results xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
      <Reference>{REFERENCE-HERE}</Reference>
      <FillerTags>Filler</FillerTags>
      <entity>
        <entityName>ABC</entityName>
        <entityId>012345</entityId>
      </entity>
      <Items>
        <Item>
          <entityId>012345</entityId>
          <FillerTagsAgain>Filler2</FillerTagsAgain>
          <FillerTagsAgain>Filler2</FillerTagsAgain>
          <FillerTagsAgain>Filler2</FillerTagsAgain> 
         </Item>
         <AnotherItem> 
           <entityId>012345</entityId>
           <FillerTagsAgain>Filler2</FillerTagsAgain>
           <FillerTagsAgain>Filler2</FillerTagsAgain>
           <FillerTagsAgain>Filler2</FillerTagsAgain> 
         </AnotherItem>
       </Items>
     <entity>
      <entityName>DEF</entityName>
        <entityId>678910</entityId>
      </entity>
      <Items>
        <Item>
          <entityId>678910</entityId>
          <FillerTagsAgain>Filler2</FillerTagsAgain>
          <FillerTagsAgain>Filler2</FillerTagsAgain>
          <FillerTagsAgain>Filler2</FillerTagsAgain> 
         </Item>
         <AnotherItem> 
           <entityId>678910</entityId>
           <FillerTagsAgain>Filler2</FillerTagsAgain>
           <FillerTagsAgain>Filler2</FillerTagsAgain>
           <FillerTagsAgain>Filler2</FillerTagsAgain> 
         </AnotherItem>
       </Items>   

Any help troubleshooting this problem would be much appreciated.

UPDATE:

The code above now works by changing the line fso.CreateTextFile(newFilePath, True, True).Write DOM.XML to fso.CreateTextFile(newFilePath, True, False).Write DOM.XML.

I am now attempting to run this on the larger set of XML data but receive an error on the line Set entity = DOM.DocumentElement.getElementsByTagName("entityId")(0)

I had this error on the example file a few times and realized I had just forgotten to set the correct directory, but despite ensuring the directory is correct this time the error persists.

UPDATE 2: The code I am receiving this error on has been modified as below. If I'm not mistaken all I did was rename a few things, but I may be wrong.

Option Explicit
Public Declare Sub Sleep Lib "kernel32" (ByVal dwMilliseconds As Long)
Sub ParseResults()
'Requires reference to Microsoft XML, v6.0
'Requires referenc to Microsoft Scripting Runtime
Dim xmlFilePath$, newFilePath$
Dim DOM As MSXML2.DOMDocument
Dim Customer As IXMLDOMNode
Dim fso As Scripting.FileSystemObject

'# Define the file you are going to load as XML
xmlFilePath = "C:\FAKEPATH\Final_Test.xml"

'# Define an output path for where to put the modified XML
newFilePath = "C:\FAKEPATH\Final_Test1.xml"

'# Create our DOM object
Set DOM = CreateObject("MSXML2.DOMDocument.6.0")

'# Load the XML file
DOM.Load xmlFilePath

'# Wait until the Document has loaded
Do
    Sleep 250
Loop Until DOM.readyState = 4

'# Get the entityID node
Set Customer = DOM.DocumentElement.getElementsByTagName("CustomerId")(0)

'# Call a subroutine to append the entity to "Item" tags
AppendCustomer DOM, "Transaction", Customer

'## Create an FSO to write the new file
Set fso = CreateObject("Scripting.FileSystemObject")

'## Attempt to write the new/modified XML to file
On Error Resume Next
'MsgBox DOM.XML
fso.CreateTextFile(newFilePath, True, False).Write DOM.XML
If Err Then
    '## Print the new XML in the Immediate window
    Debug.Print DOM.XML
    MsgBox "Unable to write to " & newFilePath & " please review XML in the Immediate window in VBE.", vbInformation
    Err.Clear
End If
On Error GoTo 0

'Cleanup
Set DOM = Nothing
Set fso = Nothing
Set Customer = Nothing

End Sub

Sub AppendCustomer(DOM As Object, Transaction As String, copyNode As Object)
'## This subroutine will append child node to ALL XML Nodes matching specific string tag.
Dim itemColl As IXMLDOMNodeList
Dim itm As IXMLDOMNode

'# Get a collection of all elements matching the tagName
Set itemColl = DOM.DocumentElement.getElementsByTagName(Transaction)

'# Iterate over the collection, appending the copied node
For Each itm In itemColl
    If itm.HasChildNodes Then
        '# Insert this node before the first child node of Item
        itm.InsertBefore copyNode.CloneNode(True), itm.FirstChild
    Else
        '# Append this node to the Item
        itm.appendChild copyNode.CloneNode(True)
    End If
Next

Set itm = Nothing
Set itemColl = Nothing

End Sub

UPDATE 3: Everything now works perfectly. The only issue is in the actual procedure pursued by the code above. Since there are multiple entities, and each set of items belongs to an entity, the code needs to find an entityId and apply this entityId to all the items that come BEFORE another occurrence of an entityId tag. After this point everything would repeat.

Community
  • 1
  • 1
114
  • 876
  • 3
  • 25
  • 51
  • 1
    Might I point you to this answer: http://stackoverflow.com/a/3871182/1208914 Could I also recommend adding some tags so that people can easily find your question – stormCloud May 23 '14 at 21:42
  • @stormCloud Thanks, this looks useful. I only tagged xml because I wasn't sure what else was relevant as a tag. – 114 May 24 '14 at 03:37
  • @stormCloud I have read over the answer but am not sure how to apply that answer to my current situation. Would you be able to elaborate? – 114 May 26 '14 at 21:42
  • Once you've loaded the xml document, You would do something like doc.selectNode("/Results/entity/entityName").text to get the value of specific nodes. What I meant when I said to add some tags was to include what language you are doing this in (I assume vb6?) As soon as you include what language you are doing this in, you will get all the people who know that language looking at it :) – stormCloud May 27 '14 at 16:52
  • @stormCloud That makes perfect sense, I should have considered that. Thanks! – 114 May 27 '14 at 18:14
  • @stormCloud I don't think the problem (see 114's [previous question](http://stackoverflow.com/questions/23568074/inserting-a-unique-id-in-specific-row-locations-for-multiple-ids/23637932#23637932)) is "how to read XML" , but rather a peculiar error that FSO is not writing the file as expected. I have run this code exactly (I wrote it!) and it works perfectly for me. But I'm stumped as to why it would fail for OP. – David Zemens May 28 '14 at 20:55
  • @David As you can see I'm stumped as well! I have started to think it might be an issue with system settings, but I haven't had any luck pinpointing the problem. I did follow your steps in the previous post multiple times but to no avail. I also added a link to my original post where you wrote the code initially. – 114 May 28 '14 at 21:09
  • Not sure if I mentioned previously, but curious to know what happens if, immediately before you do the `fso.CreateTextFile(...` you put in a `MsgBox DOM.XML` (or `Debug.Print DOM.XML` to view in the immediate pane of the VBE). I *think* (without looking through old comments) that you were able to traverse the `DOM` object in the Locals pane, and it appeared to be structured correctly, but just want to make sure. – David Zemens May 29 '14 at 19:19
  • Alternatively, try a different method of printing the file? http://msdn.microsoft.com/en-us/library/office/gg264163(v=office.15).aspx – David Zemens May 29 '14 at 19:29
  • @David That's correct. I just gave that a try now and it came up correctly structured. The OpenTextFile method resulted in an error, unfortunately. – 114 May 29 '14 at 19:44
  • OK so the Debug.Print (or MsgBox) shows the XML correctly... just covering all the bases what about `fso.CreateTextFile(newFilePath, True, False).Write DOM.XML`? – David Zemens May 29 '14 at 20:54
  • Also curious to know what error with the `Open .. As` statement? – David Zemens May 29 '14 at 20:55
  • @David It just says "unable to write to FILEPATH/FILE", though this makes sense to me because I initially had it set up to create a new file in the same folder. With respect to the new fso.CreateTextFile statement (and this is going to sound ridiculous) it actually solved the problem! It seems silly that it was something so small. Do you happen to know why that would have made all the difference? – 114 May 29 '14 at 21:34
  • the difference being the third argument in the `CreateTextFile` method specifies whether to create the file as Unicode (`True`) or ASCII (`False`). I am actually surprised that worked -- character-encoding is not my expertise but it does seem likely to be an encoding issue, so I suggested it only as the first (and easiest) option to implement (there are some more options using the `StrConv` function, too, which would have been my next step.) So that actually worked? If so, I will add it to my answer, below. – David Zemens May 30 '14 at 00:13
  • @David It did, I'm now trying to get the same procedure working for the larger XML file, but ran into the error that I mentioned in the update to my post. I'm going to check over my modifications to make sure that I didn't make any typos. Would it make a difference if the file is set up so that there is an additional root element under which multiple tags (and everything that comes after as in the original example) are placed? – 114 May 30 '14 at 13:58
  • @David I am now able to get the code to work in a file with two tag blocks, but it still won't work in the main file (which contains many tag blocks. It's the same error as before on the line Set itemColl = DOM.DocumentElement.getElementsByTagName(Transaction). That said, I get an error when I check whether the XML is valid. It states that "End tag 'Transaction' (same as Item) does not match the start tag 'DESCRIPTION'. It refers to . Do you happen to know why this would come up? – 114 May 30 '14 at 15:45
  • At this point, your original question appears to be solved and you are now simply trying to debug problems in your inputs. You're getting a parse error because you have a malformed XML file. you need to fix your file and ensure it is well-formed. That message seems to indicate you have an opening DESCRIPTION tag that is not properly closed. – David Zemens May 30 '14 at 15:53
  • @David You're right, I'll ask a separate question about this. – 114 May 30 '14 at 17:13
  • Ultimately the problem now is that the files you're actually using appear to be malformed, which will raise a parse error in DOM. Now you can modify your code to deal with the `.parseError` (e.g., to inform you of the condition and stop execution without an "Error") but the code will not be able to *fix* the errors. That will have to happen independently. – David Zemens May 30 '14 at 17:26
  • @David Agreed, I only realized after writing the error code what the error actually meant. – 114 May 30 '14 at 17:37
  • @David I realize there is one more crucial issue with the code, which is a problem with my own explanation of what I wanted the code to do (the initial example was not detailed enough). I have created a third update to discuss the extra loop (for each entityId) that would be required. Should I create a new question for this? – 114 May 30 '14 at 18:55
  • hmmm I'd ordinarily say yes but i think this might be pretty easy. let me do a quick try and if it doesn't work, you can at least use that as the starting point for a new Q. – David Zemens May 30 '14 at 19:01
  • hmmmm seems like it would be easier to structure the XML with the "Item" as a child element of the "entity", like: http://imgur.com/cNxta5I (there is not a good reason I can think of to duplicate the tags in the XML when a structured hierarchy conveys the same relationship -- linking each customer to the items -- but it may be a requirement beyond your control...) – David Zemens May 30 '14 at 19:17
  • @David That may actually be more effective for a few reasons. In reality there are 6 different tags that I will be appending the entityId to. At the moment all I did was repeat the statement 'AppendEntity DOM, "Item", entity' multiple times for different "Tags", but this would result in a lot of loops (from what I can tell) - one tracking each tag. Would there be a way to choose tags (in a similar way to AppendEntity) to go inside the entity tag as you suggested? – 114 May 30 '14 at 19:25
  • Ahh yes probably... For now I did the "simple" solution which uses the original XML structure (no longer uses the `AppendEntity` function, but if you will have multiple actions to perform then probably using another function to re-organize the XML based on specific tags/etc. would be the best way to go. I'd start with this and see how far that takes you, if you get stuck just go for a new question! – David Zemens May 30 '14 at 19:35

1 Answers1

3

I'm putting this here initially as an answer so I can show my code legibly. Will delete if this also fails. Try this syntax to use the alternative method of writing the file. Notepadd++ tells me this is ANSII:

'## Create an FSO to write the new file'
Set fso = CreateObject("Scripting.FileSystemObject")


Dim FF As Integer
FF = FreeFile
'## Attempt to write the new/modified XML to file'
fso.CreateTextFile newFilePath
Open newFilePath For Output As FF
Print #FF, dom.XML
Close #FF

Alternatively

(again, just covering the bases, and will update or remove if needed)

Try:

fso.CreateTextFile(newFilePath, True, False).Write DOM.XML

The difference being the third argument in the CreateTextFile method specifies whether to create the file as Unicode (True) or ASCII (False).

Notepad++ confirms this method is ANSII, whereas if I do True to create Unicode file, I get a UCS-2 Little Endian file.

I personally notice no difference between either Ascii/Unicode -- I can open both in Notepad or Notepad++ and they appear the same to me, but since this seems like it could be a character-encoding issue, it is worth a shot. I suggested it only as the first (and easiest) option to implement (there are some more options to explore if needed).

Update #3

To address the nested nature of the file... basically you have XML element siblings ("entity" and "Item"), and you need to modify the "Item" (and it's child nodes) to include the "entityId" (which is a child of "entity"). I'm explaining this relationship so that hopefully this modification makes sense!

'##### NO LONGER USED:'
'# Get the entityID node'
'Set Customer = DOM.DocumentElement.getElementsByTagName("CustomerId")(0)'

Dim itm As IXMLDOMNode

'# Instead of getting the first item like we did before, we can iterate the collection'
' of nodes with the entityID tag like so:'
For Each Customer In DOM.DocumentElement.getElementsByTagName("entityId")
   'Since Item is Entity nextSibling, and Entity is parent of entityId,'
   ' we can iterate the collection if its childNodes like this:'
    For Each itm In Customer.ParentNode.NextSibling.ChildNodes
        If itm.HasChildNodes Then
            '# Insert this node before the first child node of Item'
            itm.InsertBefore Customer.CloneNode(True), itm.FirstChild
        Else
            '# Append this node to the Item'
            itm.appendChild Customer.CloneNode(True)
        End If
    Next
Next

'##### This function call is no longer needed
'AppendCustomer DOM, "Transaction", Customer'

This produces XML like:

<root xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
    <Results>
        <Reference>{REFERENCE-HERE}</Reference>
        <FillerTags>Filler</FillerTags>
        <entity>
            <entityName>ABC</entityName>
            <entityId>012345</entityId>
        </entity>
        <Items>
            <Item>
                <entityId>012345</entityId>
                <FillerTagsAgain>Filler1</FillerTagsAgain>
                <FillerTagsAgain>Filler1</FillerTagsAgain>
                <FillerTagsAgain>Filler1</FillerTagsAgain>
            </Item>
            <AnotherItem>
                <entityId>012345</entityId>
                <FillerTagsAgain>Filler2</FillerTagsAgain>
                <FillerTagsAgain>Filler2</FillerTagsAgain>
                <FillerTagsAgain>Filler2</FillerTagsAgain>
            </AnotherItem>
        </Items>
    </Results>
    <Results>
        <Reference>{REFERENCE-HERE}</Reference>
        <FillerTags>Filler</FillerTags>
        <entity>
            <entityName>DEF</entityName>
            <entityId>54321</entityId>
        </entity>
        <Items>
            <Item>
                <entityId>54321</entityId>
                <FillerTagsAgain>Filler1</FillerTagsAgain>
                <FillerTagsAgain>Filler1</FillerTagsAgain>
                <FillerTagsAgain>Filler1</FillerTagsAgain>
            </Item>
            <AnotherItem>
                <entityId>54321</entityId>
                <FillerTagsAgain>Filler2</FillerTagsAgain>
                <FillerTagsAgain>Filler2</FillerTagsAgain>
                <FillerTagsAgain>Filler2</FillerTagsAgain>
            </AnotherItem>
        </Items>
    </Results>
</root>
David Zemens
  • 53,033
  • 11
  • 81
  • 130