3

I am attempting to use MSXML in VB6 to create an XML file that can then be deserialized as an object in C#.

The XML I am attempting to mimic looks like this:

<?xml version="1.0" encoding="utf-8"?>
<ArrayOfStock xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <Stock>
    <ProductCode>12345</ProductCode>
    <ProductPrice>10.32</ProductPrice>
  </Stock>
  <Stock>
    <ProductCode>45632</ProductCode>
    <ProductPrice>5.43</ProductPrice>
  </Stock>
</ArrayOfStock>

The question I have is how do I create the following line using the MSXML library?

<?xml version="1.0" encoding="utf-8"?>

That is, how do I create an unterminated "header" value?

Maxim Gershkovich
  • What problem did you have without the processing instruction? Did the XML Serializer throw an exception? Which one? – John Saunders Mar 24 '11 at 01:17
  • 1
    Wouldn't it be easier to call a COM-visible C# library that writes the file? It just needs to create objects and serialise them. Guaranteed to work, and you would never need to even *look* at the XML file yourself. Leave XML to the machines, that's what I say. – MarkJ Mar 24 '11 at 10:57
  • @John - Yes it threw an exception. Something along the lines of invalid XML. (I don't have the code with me at the moment). – Maxim Gershkovich Mar 25 '11 at 14:46
  • @MarkJ - Actually that's a great point (except that I need to distribute it to a number of sites). Still, arguably a better approach. – Maxim Gershkovich Mar 25 '11 at 14:47
  • 1
    I would definitely consider it even with the requirement to distribute. Deploying a COM-visible .Net DLL is [meant](http://stackoverflow.com/questions/273548/net-com-dll-deployment/273570#273570) to be pretty [straightforward](http://stackoverflow.com/questions/1446481/how-to-deploy-a-com) – MarkJ Mar 25 '11 at 14:54
  • Does this answer your question? [How to make XMLDOMDocument include the XML Declaration?](https://stackoverflow.com/questions/1144015/how-to-make-xmldomdocument-include-the-xml-declaration) – ivan_pozdeev Sep 09 '20 at 12:20

4 Answers

3

Have a look at this similar question.

You need to use an MXXMLWriter60 instead of saving the document directly. ... See IMXWriter for details.
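A minimal sketch of the MXXMLWriter60 approach, assuming project references to "Microsoft XML, v6.0" and "Microsoft ActiveX Data Objects"; the procedure name SaveWithDeclaration and its parameters are made up for illustration. It pushes an existing DOM document through a SAX reader into the writer, which emits the declaration itself and encodes the output as UTF-8 because it writes to a binary stream:

```vb
' Hypothetical helper: serialize a DOM document with an XML declaration.
' Assumes references to MSXML 6.0 and ADO (for ADODB.Stream).
Private Sub SaveWithDeclaration(ByVal doc As MSXML2.DOMDocument60, _
                                ByVal path As String)
    Dim writer As MSXML2.MXXMLWriter60
    Dim reader As MSXML2.SAXXMLReader60
    Dim stm As ADODB.Stream

    ' A binary stream lets the writer produce real UTF-8 bytes.
    Set stm = New ADODB.Stream
    stm.Type = adTypeBinary
    stm.Open

    Set writer = New MSXML2.MXXMLWriter60
    writer.omitXMLDeclaration = False   ' emit <?xml ... ?>
    writer.encoding = "utf-8"
    writer.byteOrderMark = False
    writer.indent = True                ' formatted output
    writer.output = stm

    ' The SAX reader walks the DOM and feeds events to the writer.
    Set reader = New MSXML2.SAXXMLReader60
    Set reader.contentHandler = writer
    reader.parse doc

    stm.SaveToFile path, adSaveCreateOverWrite
    stm.Close
End Sub
```

With this approach the document itself does not need a processing instruction node; the declaration comes from the writer's settings.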

Community
Josh M.
  • Could you possibly explain why this approach is better than what I have done? – Maxim Gershkovich Mar 25 '11 at 14:44
  • I'm not necessarily saying that it's better or worse - just that if you want to write the XML declaration, I believe this is how you have to do it. Have a look at the link I posted for more information. – Josh M. Mar 25 '11 at 15:07
  • 2
    Oh, you're referring to the answer you posted. Well, I believe that is a "hack" since you are manually adding the encoding instead of setting the encoding on the XML document. Characters within the document will not magically be encoded that way, so when you read the file back in you can't be sure that it will look the same as it did when you wrote it. That's because when reading it back in, the encoding in the processing instruction will be used whether or not your content was actually encoded that way. – Josh M. Mar 25 '11 at 15:11
  • Fair enough. I didn't fully appreciate the implications till now - it's finally clicked. – Maxim Gershkovich Mar 26 '11 at 05:32
3

Lots of hubbub here about UTF-8, but the DOMDocument.save() method does use the PI to determine how to encode saved output. The only real snag is that for formatted output instead of economical output (no whitespace) you need to use the SAX Writer.

Basically things seem to work just as expected though. There is nothing hackish about this, it's how it is done.

Option Explicit

Private Sub Main()
    Dim varStock As Variant
    Dim docStock As MSXML2.DOMDocument
    Dim elemRoot As MSXML2.IXMLDOMElement
    Dim elemStock As MSXML2.IXMLDOMElement
    Dim elemField As MSXML2.IXMLDOMElement
    Dim I As Integer
    
    varStock = Array(Array("12345", 10.32), _
                     Array("¥45632", 5.43)) 'Yen sign used here to show Unicode.
    
    Set docStock = New MSXML2.DOMDocument
    With docStock
        .appendChild .createProcessingInstruction("xml", _
                                                  "version=""1.0"" encoding=""utf-8""")
        Set elemRoot = .createElement("ArrayOfStock")
        With elemRoot
            .setAttribute "xmlns:xsi", "http://www.w3.org/2001/XMLSchema-instance"
            .setAttribute "xmlns:xsd", "http://www.w3.org/2001/XMLSchema"
            For I = 0 To UBound(varStock)
                Set elemStock = docStock.createElement("Stock")
                With elemStock
                    Set elemField = docStock.createElement("ProductCode")
                    elemField.Text = CStr(varStock(I)(0))
                    .appendChild elemField
                    Set elemField = docStock.createElement("ProductPrice")
                    elemField.Text = CStr(varStock(I)(1))
                    .appendChild elemField
                End With
                .appendChild elemStock
            Next
        End With
        Set .documentElement = elemRoot
        On Error Resume Next
        Kill "created.xml"
        On Error GoTo 0
        .save "created.xml"
    End With
End Sub

If you examine the output file and look for the Yen sign, you should see that the text is UTF-8 encoded.

If you want this in-memory rather than to disk you can .save() to something like an ADODB.Stream object, or just use XMLHTTPRequest.send with the DOMDocument as the argument (body). There is no need to resort to the heavyweight option of an Interop approach.
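As a rough sketch of the in-memory variant, assuming a project reference to "Microsoft ActiveX Data Objects" and using a made-up function name (DomToUtf8Bytes), the document can be saved into a binary ADODB.Stream and read back as a byte array:

```vb
' Hypothetical helper: return the serialized document as bytes.
' Assumes a reference to ADO for ADODB.Stream.
Private Function DomToUtf8Bytes(ByVal doc As MSXML2.DOMDocument) As Byte()
    Dim stm As ADODB.Stream

    Set stm = New ADODB.Stream
    stm.Type = adTypeBinary
    stm.Open

    ' save() honours the encoding named in the processing instruction.
    doc.save stm

    ' Rewind before reading everything that was written.
    stm.Position = 0
    DomToUtf8Bytes = stm.Read
    stm.Close
End Function
```

The returned bytes should match what .save would have written to disk, so they can be inspected or sent on without a temporary file.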

ivan_pozdeev
Bob77
1

It's called a "declaration".

On your XML writer, set the property omitXMLDeclaration to False and encoding to "utf-8".

Daniel A. White
1

Thanks to both of you for your input, but unfortunately the methods described only apply to XML on the .NET platform.

(But you did guide me in the right direction)

In VB6 (using MSXML 3 and above), the method that accomplishes what I was looking for is createProcessingInstruction().

The code looks like this:

Private Sub BuildHeader()
    m_document.appendChild m_document.createProcessingInstruction("xml", "version=""1.0"" encoding=""utf-8""")
End Sub

The file can then be processed like this (assuming all the other object details are consistent):

// The root element is ArrayOfStock, so deserialize a list rather than a single Stock.
XmlSerializer serializer = new XmlSerializer(typeof(List<Stock>));
using (StreamReader streamReader = new StreamReader(path))
{
    return (List<Stock>)serializer.Deserialize(streamReader);
}
Maxim Gershkovich
  • May not be a good idea. You are forcing the document to be marked as UTF-8. I advise you to make sure the XML really is encoded as UTF-8! Try writing some characters from the "ASCII" range 128-255 and see whether they deserialise properly in the C#. You are in danger of writing an "angle-bracket delimited file" rather than valid XML. An alternative: there *must* be some properties on MSXML that instruct it to write UTF-8 and to write the declaration line. How about http://msdn.microsoft.com/en-us/library/ms764660(v=VS.85).aspx and http://msdn.microsoft.com/en-us/library/ms764660(v=VS.85).aspx – MarkJ Mar 24 '11 at 10:53
  • I am only adding this because the resulting xml from running serializer.Serialize(myObject); includes the same header value. I do not change any default settings for serialization. – Maxim Gershkovich Mar 25 '11 at 14:43
  • When calling serializer.Serialize(); it would surely be. However, valid point. I didn't fully understand what you meant till now. Will need to look into it. – Maxim Gershkovich Mar 26 '11 at 05:30