I am trying to generate a CSV file from an XML file of the form:

    <?xml version="1.0" encoding="UTF-8"?>
    <Envelope>
        <Header>
            <env:MessageSentDateTime>2012-09-19T09:50:04Z</env:MessageSentDateTime>
            <env:MessageSequenceNumber>856432</env:MessageSequenceNumber>
        </Header>
        <Body>
            <Data>
                <Data:ID>
                    <Data:AODB>9346280</Data:AODB>
                    <Data:Ref>
                        <common:Code>HJ</common:Code>
                        <common:num>8113</common:num>
                    </Data:Ref>
                 </Data:ID>
                 ... Continues like this, no set number of nodes, parts or AnotherParts
         </Body>
         Second message starting with <Header> ending with </Body>, 
         will be more than 2 in practice
    </Envelope>

I want to put a newline in the CSV file at each closing `</Body>` tag, since this marks the end of a message. The messages are mixed, so they have different numbers of nodes, different numbers of parts, and no consistent end node in the Body part. In addition, some nodes contain no text, but I still want a comma emitted for them.

So far I have:

    <?xml version="1.0" encoding="utf-8"?>
    <xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
        <xsl:output method="text"/>
        <xsl:strip-space elements="*"/>

        <xsl:template match="*[not(*)]">
            <xsl:value-of select="normalize-space(.)"/>
            <xsl:text>,</xsl:text>
        </xsl:template>

        <xsl:template match="Body[last()]">
            <xsl:value-of select="normalize-space(.)"/>
            <xsl:text>&#10;</xsl:text>
        </xsl:template>
    </xsl:stylesheet>

This also adds a comma after the last piece of information in Body. I'd like to replace that comma with a newline. Is there a simple way to do this?

Regards, David

DH_CC89
  • Try this: http://stackoverflow.com/questions/3056579/convert-xml-document-to-comma-delimited-csv-file-using-xslt-stylesheet – Milind Thakkar Nov 27 '12 at 10:20

2 Answers


One approach could be to change the template matching the Body element so that it specifically selects the 'leaf' descendant elements (those with no child elements):

    <xsl:template match="Body">
       <xsl:apply-templates select=".//*[not(*)]"/>
       <xsl:text>&#10;</xsl:text>
    </xsl:template>

Then, in your template that matches the leaf elements, output a comma before the value only if the element is not the first one selected:

    <xsl:if test="position() &gt; 1">
       <xsl:text>,</xsl:text>
    </xsl:if>

Here is the full XSLT if you want to give it a go:

    <xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
       <xsl:output method="text"/>
       <xsl:strip-space elements="*"/>

       <xsl:template match="*[not(*)]">
          <xsl:if test="position() &gt; 1">
             <xsl:text>,</xsl:text>
          </xsl:if>
          <xsl:value-of select="normalize-space(.)"/>
       </xsl:template>

       <xsl:template match="Body">
          <xsl:apply-templates select=".//*[not(*)]"/>
          <xsl:text>&#10;</xsl:text>
       </xsl:template>
    </xsl:stylesheet>

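
Since the stylesheet already declares version 2.0, another option worth trying is to let `xsl:value-of` insert the commas through its `separator` attribute, so that no trailing comma is written in the first place. The following is only a sketch under stated assumptions: an XSLT 2.0 processor, `Header` and `Body` in no namespace (as in the sample), and each message consisting of a `Header` followed by its `Body`. The `xsl:for-each-group` grouping is an illustration, not something taken from the question.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
   <xsl:output method="text"/>
   <xsl:strip-space elements="*"/>

   <!-- Root element (Envelope in the sample): one output line per message.
        Assumption: every message starts with a Header element. -->
   <xsl:template match="/*">
      <xsl:for-each-group select="*" group-starting-with="Header">
         <!-- Collect the text of every leaf element in this message;
              the separator is placed only between values, never after the last -->
         <xsl:value-of select="current-group()//*[not(*)]/normalize-space(.)"
                       separator=","/>
         <xsl:text>&#10;</xsl:text>
      </xsl:for-each-group>
   </xsl:template>
</xsl:stylesheet>
```

An empty leaf element contributes an empty string to the sequence, so it still gets its comma, which matches the requirement in the question.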
Tim C
  • Hi Tim C, thank you for your reply. This code sorts out the problem with the final comma but it doesn't give me a new line in the output – DH_CC89 Nov 27 '12 at 11:39
  • Can you try replacing `&#10;` with `NEWLINE` in the XSLT? That will show whether the template matching **Body** is actually being used. Also, it would help a bit if you edited your question to show a complete example of XML with actual data in it (nothing too big, though!). Thanks! – Tim C Nov 27 '12 at 12:37
  • I tried replacing it with NEWLINE, and the template isn't being used. I've edited the question with a little of the actual XML data – DH_CC89 Nov 27 '12 at 13:11
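
The thread does not establish why the **Body** template was never used. One plausible cause, offered purely as an assumption, is that the real document declares a default namespace, in which case `match="Body"` (which only matches an element in no namespace) selects nothing. A minimal XSLT 2.0 sketch that sidesteps this by matching on the local name with the `*:Body` wildcard:

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
   <xsl:output method="text"/>
   <xsl:strip-space elements="*"/>

   <!-- *:Body matches an element whose local name is Body,
        whatever namespace (or no namespace) it is in -->
   <xsl:template match="*:Body">
      <xsl:apply-templates select=".//*[not(*)]"/>
      <xsl:text>&#10;</xsl:text>
   </xsl:template>

   <!-- Leaf elements: comma before every value except the first one selected -->
   <xsl:template match="*[not(*)]">
      <xsl:if test="position() &gt; 1">
         <xsl:text>,</xsl:text>
      </xsl:if>
      <xsl:value-of select="normalize-space(.)"/>
   </xsl:template>
</xsl:stylesheet>
```

If the namespace URI is known, declaring it and using a prefixed name (or `xpath-default-namespace`) would be the more precise choice; the wildcard is just the quickest way to test the hypothesis.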

My specifications changed slightly: I needed the node path as well as the value in order to make each entry unique. This is the code I used to solve my problem:

    <?xml version="1.0" encoding="utf-8"?>
    <xsl:stylesheet version="2.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
        <xsl:output method="text"/>
        <xsl:strip-space elements="*"/>

        <xsl:template match="*[*]">
            <xsl:param name="elementNames" select="''"/>
            <xsl:apply-templates select="*">
                <xsl:with-param name="elementNames">
                    <xsl:copy-of select="$elementNames"/>
                    <xsl:value-of select="replace(name(.), ':', '.')"/>
                    <xsl:text>_</xsl:text>
                </xsl:with-param>
            </xsl:apply-templates>
        </xsl:template>

        <xsl:template match="*[not(*)]">
            <xsl:param name="elementNames" select="''"/>
            <xsl:copy-of select="$elementNames"/>
            <xsl:value-of select="replace(name(.), ':', '.')"/>
            <xsl:value-of select="name()"/>,<xsl:apply-templates select="current()/text()"/>
            <xsl:text>&#10;</xsl:text>
        </xsl:template>

        <xsl:template match="/*">
            <xsl:apply-templates select="*"/>
        </xsl:template>
    </xsl:stylesheet>

Thank you to everybody who looked and tried to help.

Regards, David

DH_CC89