1

I'm working in Excel 2010 using VBA. I have a column in a MS SQL server 2008 database table that contains a single line of XML in it. I'm extracting it and using a 'prettyprint' routine I wrote to make it readable. The issue that I'm running into is that it doesn't take the value I give it in the xslt transformation. As you can see, I've tried quite a few combinations that I've found doing different google searches. Here is the code:

Function PrettyPrintXML(strXML As String, Optional sFileOut As String) As String

Dim xDoc As MSXML2.DOMDocument60
Set xDoc = New MSXML2.DOMDocument60
Dim xDocStyle As MSXML2.DOMDocument60
Set xDocStyle = New MSXML2.DOMDocument60
Dim xDocOut As MSXML2.DOMDocument60
Set xDocOut = New MSXML2.DOMDocument60
Dim strXMLStyleSheet As String

strXMLStyleSheet = "<?xml version=""1.0"" encoding=""utf-8""?>" & vbCrLf
strXMLStyleSheet = strXMLStyleSheet & "<xsl:stylesheet  version=""1.0"" xmlns:xsl=""http://www.w3.org/1999/XSL/Transform""" & vbCrLf
'strXMLStyleSheet = strXMLStyleSheet & " extension -element - prefixes = ""redirect""" & vbCrLf
'strXMLStyleSheet = strXMLStyleSheet & " xmlns:xalan=""http://xml.apache.org/xalan"">" & vbCrLf
strXMLStyleSheet = strXMLStyleSheet & " xmlns:xalan=""http://xml.apache.org/xslt"">" & vbCrLf
'strXMLStyleSheet = strXMLStyleSheet & "<xsl:output method=""xml"" encoding=""UTF-8"" indent=""yes"" xalan:indent-amount=""2""/>" & vbCrLf
strXMLStyleSheet = strXMLStyleSheet & "<xsl:output method=""xml"" indent=""yes"" xalan:indent-amount=""2""/>" & vbCrLf
strXMLStyleSheet = strXMLStyleSheet & "<xsl:strip-space elements=""*""/>" & vbCrLf
strXMLStyleSheet = strXMLStyleSheet & "<xsl:template match=""@* | node()"">" & vbCrLf
strXMLStyleSheet = strXMLStyleSheet & "  <xsl:copy>" & vbCrLf
strXMLStyleSheet = strXMLStyleSheet & "    <xsl:apply-templates select=""@* | node()""/>" & vbCrLf
strXMLStyleSheet = strXMLStyleSheet & "  </xsl:copy>" & vbCrLf
strXMLStyleSheet = strXMLStyleSheet & "</xsl:template>" & vbCrLf & vbCrLf
strXMLStyleSheet = strXMLStyleSheet & "</xsl:stylesheet>"

'strXMLStyleSheet = "<xsl:stylesheet xmlns:xsl=""http://www.w3.org/1999/XSL/Transform""" & vbCrLf
'strXMLStyleSheet = strXMLStyleSheet & "Version = ""1.0""" & vbCrLf
'strXMLStyleSheet = strXMLStyleSheet & "xmlns: redirect = ""http://xml.apache.org/xalan/redirect""" & vbCrLf
'strXMLStyleSheet = strXMLStyleSheet & "extension-element-prefixes = ""redirect""" & vbCrLf
'strXMLStyleSheet = strXMLStyleSheet & "xmlns:xalan=""http://xml.apache.org/xslt"">" & vbCrLf
'strXMLStyleSheet = strXMLStyleSheet & "<xsl:output method=""xml"" indent=""yes"" xalan:indent-amount=""2""/>" & vbCrLf

With xDocStyle
    .async = False
    .validateOnParse = False
    .resolveExternals = False
    DebugPrint strXMLStyleSheet
    .LoadXML strXMLStyleSheet
End With

With xDoc
    .async = False
    .validateOnParse = False
    .resolveExternals = False
    DebugPrint strXML
    .LoadXML strXML
    .transformNodeToObject xDocStyle, xDocOut
End With

PrettyPrintXML = xDocOut.XML

If Not sFileOut = "" Then
    If Not PrettyPrintXML = "" Then
        Dim ffile As String
        ffile = FreeFile
        Open sFileOut For Output As #ffile
            Print #ffile, PrettyPrintXML
        Close #ffile
    Else
        MsgBox "No file was generated. Check your parameters and try again.", vbOKOnly, "Empty File"
    End If
End If

Set xDocOut = Nothing
Set xDocStyle = Nothing
Set xDoc = Nothing

End Function

The xslt transform in the latest version of code is:

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet  version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns:xalan="http://xml.apache.org/xslt">
<xsl:output method="xml" indent="yes" xalan:indent-amount="2"/>
<xsl:strip-space elements="*"/>
<xsl:template match="@* | node()">
    <xsl:copy>
        <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
</xsl:template>

The interesting part is if I open it in Notepad++, it does indent 2 like I want but it changes the encoding to utf-16. If I open it in Notepad, it also changes the encoding to utf-16 but doesn't change the spacing. I've tried changing it to utf-16 as well as ASCII. No change though. My ultimate goal is to give the user a choice of opening it in Notepad, wordpad, MS Word 2010 or any custom program they want like Ultraedit, textedit, vi, Emacs, etc. I have to get this working first though.

Community
  • 1
  • 1
Matt Williamson
  • 6,947
  • 1
  • 23
  • 36
  • 2
    It seems you are saving document to an open text file. Why not use the [Save Method](https://msdn.microsoft.com/en-us/library/ms753769(v=VS.85).aspx) after transformation? Also in [output](https://msdn.microsoft.com/en-us/library/ms256187(v=vs.110).aspx) tag of xslt, you an specify encoding. – Parfait Sep 10 '15 at 13:33
  • 1
    The spacing attribute you're using appears to be Xalan specific - how would MSXML know what to do with that? – Dan Field Sep 10 '15 at 13:37
  • 1
    And while this is not exactly a duplicate, you can more easily pretty print XML in VBA: http://stackoverflow.com/questions/1118576/how-can-i-pretty-print-xml-source-using-vb6-and-msxml – Dan Field Sep 10 '15 at 13:38
  • @parfait I made the sFileOut optional and only used it for debugging. When I call call the code, I don't use it. I'm very new to xslt so I'm just learning as I go. Thanks for the comments. – Matt Williamson Sep 10 '15 at 13:52

1 Answers1

0

The interesting part is if I open it in Notepad++, it does indent 2 like I want but it changes the encoding to utf-16.

<snip />

<xsl:output method="xml" indent="yes" xalan:indent-amount="2"/>

You are using MSXML with a Xalan extension. That won't work, because MSXML does not know anything of Xalan extensions. The fact that you see indent=2 in Notepad++ is probably because you configured it to indent that way in the editor's settings. In Windows' Notepad you see the correct indent of 4.

Note that from an XML perspective, the encoding or indent does not matter. That matters only in case of humanly reading the file and you prefer another indentation, which you can just as well configure in your XML editor, Visual Studio or (really not an XML editor) Notepad++.

If I open it in Notepad, it also changes the encoding to utf-16 but doesn't change the spacing.

How do you mean "it changes the encoding"? That could only happen if you save the file again.

    Open sFileOut For Output As #ffile
       Print #ffile, PrettyPrintXML
   Close #ffile

Since you are writing the file using non-XML methods, you have little control over its encoding. If you want a certain encoding, write the file directly from the XSLT transform, i.e., use:

<xsl:output encoding="utf-8" />

And have MSXML save it for you, as opposed to writing it to a string and then saving it.

With regards to pretty-printing (if indentation itself is not enough), have a look at this SO question and answer.

Community
  • 1
  • 1
Abel
  • 56,041
  • 24
  • 146
  • 247