I'm trying to make a simple XML --> CSV script, using XSLT. I found that etree seems to "want" a tag to output... Does anyone know a workaround? Yes, I've seen this post: XML to CSV Using XSLT.

See below...

Here's some sample XML data, just for reference. My code doesn't do anything with the data yet, because it was failing to even write a header.

    <projects>
    <project>
    <name>Shockwave</name> 
    <language>Ruby</language> 
    <owner>Brian May</owner> 
    <state>New</state> 
    <startDate>31/10/2008 0:00:00</startDate> 
    </project>
    <project>
    <name>Other</name> 
    <language>Erlang</language> 
    <owner>Takashi Miike</owner> 
    <state> Canceled </state> 
    <startDate>07/11/2008 0:00:00</startDate> 
    </project>
    </projects>

Here's my script:

    import sys
    from lxml import etree

    system_file = sys.argv[1]
    xml_file = sys.argv[2]

    sys_txt = open( system_file,"r" ).read()
    xsl_txt = open( "csv_file.xslt","r" ).read()


    sysroot = etree.fromstring( sys_txt )
    xslroot = etree.fromstring( xsl_txt )
    transform = etree.XSLT( xslroot )

    with open( xml_file, "w" ) as f:
        f.write(etree.tostring( transform(sysroot) ) )

This XSLT code does NOT work (etree.tostring(...) returns None):

    <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="text"/>
    <xsl:template match="/">
    Hi
    </xsl:template>

    </xsl:stylesheet>

But THIS XSLT does work... it seems etree needs the output to contain an XML element?

    <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="text"/>
    <xsl:template match="/">
    <dummy>
    Hi
    </dummy>
    </xsl:template>

    </xsl:stylesheet>

At this point I'm thinking I can proceed with a dummy tag, then remove it at the end...

Community

1 Answer

"Python etree XSLT Requires Tag output?"

The answer is NO.

As shown in the "XSLT result objects" section of the lxml documentation, you can pass the transformation result to Python's standard str() function to get its string representation. This is what you want when the result has no root element:

    from lxml import etree

    raw_xml = '''<projects>
      <project>
        <name>Shockwave</name>
        <language>Ruby</language>
        <owner>Brian May</owner>
        <state>New</state>
        <startDate>31/10/2008 0:00:00</startDate>
      </project>
      <project>
        <name>Other</name>
        <language>Erlang</language>
        <owner>Takashi Miike</owner>
        <state>Canceled</state>
        <startDate>07/11/2008 0:00:00</startDate>
      </project>
    </projects>'''
    raw_xslt = '''<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
      <xsl:output method="text"/>
      <xsl:template match="/">
        <xsl:text>Hi</xsl:text>
      </xsl:template>
    </xsl:stylesheet>'''

    sysroot = etree.fromstring(raw_xml)
    xslroot = etree.fromstring(raw_xslt)
    transform = etree.XSLT(xslroot)

    # str() serializes the result according to xsl:output (plain text here)
    print(str(transform(sysroot)))
    # output:
    # Hi

And as you saw, etree.tostring() is still usable when the transformation result has a root element.
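Coming back to the original XML-to-CSV goal, here is a minimal sketch of how the same str() approach could produce real CSV text. The stylesheet, the hard-coded header row, and the helper name `xml_to_csv` are my own assumptions for illustration, not something from the question. It also does no CSV quoting, so it assumes the field values contain no commas, quotes, or newlines.

```python
from lxml import etree

RAW_XML = '''<projects>
  <project>
    <name>Shockwave</name>
    <language>Ruby</language>
    <owner>Brian May</owner>
    <state>New</state>
    <startDate>31/10/2008 0:00:00</startDate>
  </project>
  <project>
    <name>Other</name>
    <language>Erlang</language>
    <owner>Takashi Miike</owner>
    <state> Canceled </state>
    <startDate>07/11/2008 0:00:00</startDate>
  </project>
</projects>'''

# Hypothetical stylesheet: one CSV row per <project>, one column per child
# element. normalize-space() trims stray whitespace such as " Canceled ".
CSV_XSLT = '''<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:template match="/">
    <xsl:text>name,language,owner,state,startDate&#10;</xsl:text>
    <xsl:for-each select="projects/project">
      <xsl:for-each select="*">
        <xsl:value-of select="normalize-space(.)"/>
        <xsl:if test="position() != last()">,</xsl:if>
      </xsl:for-each>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>'''


def xml_to_csv(xml_text):
    """Apply the text-output stylesheet and return the result as a str."""
    transform = etree.XSLT(etree.fromstring(CSV_XSLT))
    result = transform(etree.fromstring(xml_text))
    # str() instead of etree.tostring(): the result has no root element
    return str(result)


csv_text = xml_to_csv(RAW_XML)
print(csv_text)
```

To save the result, write `csv_text` to a file opened in text mode with a plain `f.write(csv_text)`, in place of the `etree.tostring()` call in the question's script.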

har07
  • Thank you. That works. I'm curious though, why then does etree need a tostring method? just for pretty printing? – user3700949 Mar 20 '16 at 22:34
  • It is normally used to print XML element/tree, not an XSLT transformation result which can be XML, text, HTML, etc. – har07 Mar 21 '16 at 00:33
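To illustrate the distinction made in the comments, here is a small sketch that runs two stylesheets over the same input: one producing plain text (read with str()) and one producing an XML tree (serialized with etree.tostring()). The `<greeting>` element and the variable names are made up for this example.

```python
from lxml import etree

doc = etree.fromstring("<projects><project><name>Shockwave</name></project></projects>")

# Stylesheet 1: text output, so the result has no root element.
text_transform = etree.XSLT(etree.fromstring('''
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:template match="/"><xsl:text>Hi</xsl:text></xsl:template>
</xsl:stylesheet>'''))

# Stylesheet 2: XML output with a root element (<greeting> is hypothetical).
xml_transform = etree.XSLT(etree.fromstring('''
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml"/>
  <xsl:template match="/"><greeting>Hi</greeting></xsl:template>
</xsl:stylesheet>'''))

text_result = text_transform(doc)
xml_result = xml_transform(doc)

# Text result: str() honours xsl:output and returns the plain text.
text_out = str(text_result)

# XML result: there is a root element, so tostring() can serialize the tree
# (it returns bytes by default).
xml_out = etree.tostring(xml_result, pretty_print=True)

print(text_out)
print(xml_out.decode("ascii"))
```

So etree.tostring() is about serializing elements and trees, with options such as pretty_print, while str() on an XSLT result follows whatever output method the stylesheet declares.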