I am looking for an efficient XSLT stylesheet that converts an XML document to CSV data. It should account for every element that appears among the child nodes. For example, the XML looks like this:

<?xml version="1.0" encoding="ISO-8859-1"?>
<sObjects>
   <sObject>
     <Name>Raagu</Name>
     <BillingStreet>Hoskote</BillingStreet>
   </sObject>
   <sObject>
      <Name>Rajath</Name>
      <BillingStreet>BTM</BillingStreet>
      <age>25</age>
   </sObject>
   <sObject>
      <Name>Sarath</Name>
      <BillingStreet>Murgesh</BillingStreet>
      <location>Bangalore</location>
   </sObject>
</sObjects>

And my output CSV should look like this:

Name,BillingStreet,age,location
Raagu,Hoskote,,
Rajath,BTM,25,
Sarath,Murgesh,,Bangalore

Every row should have a field for every key in the CSV header, even if the XML does not have a value for it.

Following is the XSLT code I came up with by looking at different examples here:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="text"/>
    <xsl:variable name="delimiter" select="','"/>

    <xsl:key name="field" match="sObject/*" use="name()"/>

    <xsl:template match="/">

        <xsl:for-each select="/*/*/*[generate-id()=generate-id(key('field', name())[1])]">
            <xsl:value-of select="name()"/>

            <xsl:if test="position() != last()">
                <xsl:value-of select="$delimiter"/>
            </xsl:if>
         </xsl:for-each>

        <xsl:text>&#xa;</xsl:text>

        <xsl:for-each select="/*/sObject">

            <xsl:variable name="property" select="." />
            <xsl:for-each select="$property/*">

                <xsl:variable name="value" select="." />
                <xsl:value-of select="$value"/>
                <xsl:if test="position() != last()">
                    <xsl:value-of select="$delimiter"/>
                </xsl:if>
                <xsl:if test="position() = last()">
                    <xsl:text>&#xa;</xsl:text>
                </xsl:if>

             </xsl:for-each>

        </xsl:for-each>


     </xsl:template>
 </xsl:stylesheet>

and it prints this output:

Name,BillingStreet,age,location
Raagu,Hoskote
Rajath,BTM,25
Sarath,Murgesh,Bangalore

But I want every row to contain a field for each of the keys on the first row, left empty where the record has no value.

Could you please help me achieve this using XSLT code?

Raghavendra Nilekani
  • We will help. When you get stuck. Try something, and ask if something is not clear. [FAQ] – ppeterka Mar 05 '13 at 14:24
  • Possible duplicate of several similar questions. See links under "Related". – mzjn Mar 05 '13 at 14:27
  • Thanks ppeterka. Since I am new to XSLT, I have been trying different things to get this done. I have written sample XSL code that produces the first row (all the keys), but I cannot work out the logic to put the values in their proper positions. I have looked at the related links and did not find a solution to my problem. I am stuck at that point and need help from Stack Overflow now. – Raghavendra Nilekani Mar 05 '13 at 14:27
  • So show us what you have tried, so we can understand what you're having problems with – ChrisW Mar 05 '13 at 14:28
  • Also, can you use XSLT 2.0? – Eero Helenius Mar 05 '13 at 14:30
  • My prerequisite is to make this work in XSLT 1.0. – Raghavendra Nilekani Mar 05 '13 at 14:34

1 Answer

How about this for a two-step solution:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="text"/>
    <xsl:variable name="delimiter" select="','"/>

    <xsl:key name="field" match="/*/*/*" use="local-name()"/>

    <!-- variable containing the first occurrence of each field -->
    <xsl:variable name="allFields"
         select="/*/*/*[generate-id()=generate-id(key('field', local-name())[1])]" />

    <xsl:template match="/">
        <xsl:for-each select="$allFields">
            <xsl:value-of select="local-name()" />
            <xsl:if test="position() &lt; last()">
                <xsl:value-of select="$delimiter" />
            </xsl:if>
        </xsl:for-each>
        <xsl:text>&#10;</xsl:text>
        <xsl:apply-templates select="*/*" />
    </xsl:template>

    <xsl:template match="*">
        <xsl:variable name="this" select="." />
        <xsl:for-each select="$allFields">
            <xsl:value-of select="$this/*[local-name() = local-name(current())]" />
            <xsl:if test="position() &lt; last()">
                <xsl:value-of select="$delimiter" />
            </xsl:if>
        </xsl:for-each>
        <xsl:text>&#10;</xsl:text>
    </xsl:template>
</xsl:stylesheet>

The trick here is that the allFields variable contains exactly one element for each distinct field name (the first occurrence of that name in the document), so it's this list of nodes that we iterate over for each row, not simply the elements that actually exist in that row. When a row has no element with a given name, the inner value-of selects an empty node-set and outputs an empty string, which leaves the field blank.

Since you say that you want to support XML in arbitrary namespaces etc., I've used patterns like /*/*/* rather than hard-coding any particular element names (/*/*/* simply matches any element that is a grandchild of the document element, regardless of element names), and I'm using local-name() instead of name() to ignore any namespace prefixes (it would treat <sObject>, <sObject xmlns="foo"> and <f:sObject xmlns:f="foo"> exactly the same).
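
If you want to sanity-check the expected CSV outside an XSLT processor, the same two-pass idea can be sketched in Python. This is only an illustration of the logic, not a replacement for the XSLT 1.0 stylesheet; the names xml_to_csv, local_name and SAMPLE are made up for this sketch, and it assumes the same shape as above (records are children of the document element, fields are children of each record).

```python
import xml.etree.ElementTree as ET

# Sample input from the question (XML declaration omitted for brevity).
SAMPLE = """<sObjects>
   <sObject>
     <Name>Raagu</Name>
     <BillingStreet>Hoskote</BillingStreet>
   </sObject>
   <sObject>
      <Name>Rajath</Name>
      <BillingStreet>BTM</BillingStreet>
      <age>25</age>
   </sObject>
   <sObject>
      <Name>Sarath</Name>
      <BillingStreet>Murgesh</BillingStreet>
      <location>Bangalore</location>
   </sObject>
</sObjects>"""


def local_name(tag):
    # ElementTree spells namespaced tags as "{uri}local"; keep only "local",
    # which plays the role of local-name() in the stylesheet.
    return tag.rsplit("}", 1)[-1]


def xml_to_csv(xml_text, delimiter=","):
    root = ET.fromstring(xml_text)

    # Pass 1: collect each distinct field name once, in first-seen order
    # (the counterpart of the allFields variable).
    fields = []
    seen = set()
    for record in root:
        for child in record:
            name = local_name(child.tag)
            if name not in seen:
                seen.add(name)
                fields.append(name)

    lines = [delimiter.join(fields)]

    # Pass 2: for every record, walk the full field list rather than the
    # record's own children, so missing fields become empty strings.
    for record in root:
        values = {}
        for child in record:
            # setdefault keeps the first element of a given name, as
            # xsl:value-of does when several nodes match.
            values.setdefault(local_name(child.tag), "".join(child.itertext()))
        lines.append(delimiter.join(values.get(name, "") for name in fields))

    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(xml_to_csv(SAMPLE), end="")
```

Like the stylesheet, this does no CSV quoting, so a value that itself contains a comma or a newline would need extra handling.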

Ian Roberts
  • Hi Ian, this works really well. Thanks for the great solution. In parallel, I was going through some of the questions on 'array kind of structure' here on Stack Overflow, and it will lead to O(N) time complexity. I wanted to understand from you the time complexity of generating the "allFields" variable. Once the "allFields" variable is generated, I hope the time complexity would be O(n), where 'n' is the number of fields in allFields. Need your help here. Also, I am trying to make this XSLT more generic: what if a namespace comes in the XML, for example – Raghavendra Nilekani Mar 05 '13 at 15:25
  • I send XML like this raagu How do we handle that ? – Raghavendra Nilekani Mar 05 '13 at 15:29
  • @RaghavendraNilekani, handle that by asking separate SO questions and by reading SO questions and answers -- similar questions have been asked thousands of times and most have good answers. Also, it is good etiquette to *accept* the answer (click the check-mark next to this answer) if it "works really cool". – Dimitre Novatchev Mar 05 '13 at 15:34
  • @RaghavendraNilekani I've made some changes so it's no longer sensitive to any particular element names or namespaces. – Ian Roberts Mar 05 '13 at 15:36
  • Thanks Ian. I found the answer by having a namespace defined within XSLT. But thanks for the great help. Thanks to stackoverflow. – Raghavendra Nilekani Mar 06 '13 at 06:26