I needed to use XSL to generate simple plain text output from XML. Since I didn't find any good, concise example online, I decided to post my solution here. Any links referring to a better example would of course be appreciated:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:fo="http://www.w3.org/1999/XSL/Format" >
    <xsl:output method="text" omit-xml-declaration="yes" indent="no"/>
    <xsl:template match="/">
        <xsl:for-each select="script/command" xml:space="preserve">at -f <xsl:value-of select="username"/> <xsl:value-of select="startTime/@hours"/>:<xsl:value-of select="startTime/@minutes"/> <xsl:value-of select="startDate"/><xsl:text>
</xsl:text></xsl:for-each> 
    </xsl:template>
</xsl:stylesheet>

A few important things that helped me out here:

  1. the use of xsl:output to omit the standard declaration at the beginning of the output document
  2. the use of the xml:space="preserve" attribute to preserve any whitespace I wrote within the xsl:for-each tag. This also required me to write all code within the for-each tag, including that tag as well, on a single line (with the exception of the line break).
  3. the use of <xsl:text> to insert a line break - again I had to omit standard xml indenting here.
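
A minimal sketch of an alternative to points 2 and 3, assuming a conforming XSLT 1.0 processor: whitespace-only text nodes in a stylesheet are stripped unless they sit inside xsl:text or under xml:space="preserve". If every literal piece of output is wrapped in xsl:text, the stylesheet can therefore be indented normally and xml:space="preserve" is not needed. The element and attribute names are the ones used in this question; the layout is only an illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <!-- method="text" already means no XML declaration is written -->
    <xsl:output method="text"/>

    <xsl:template match="/">
        <xsl:for-each select="script/command">
            <!-- Literal output lives inside xsl:text; the indentation around
                 these instructions is whitespace-only and gets stripped -->
            <xsl:text>at -f </xsl:text>
            <xsl:value-of select="username"/>
            <xsl:text> </xsl:text>
            <xsl:value-of select="startTime/@hours"/>
            <xsl:text>:</xsl:text>
            <xsl:value-of select="startTime/@minutes"/>
            <xsl:text> </xsl:text>
            <xsl:value-of select="startDate"/>
            <!-- &#xA; is a line feed, written as a character reference so
                 reformatting the stylesheet cannot disturb it -->
            <xsl:text>&#xA;</xsl:text>
        </xsl:for-each>
    </xsl:template>
</xsl:stylesheet>
```

This is longer than the single-line version, but each output fragment is explicit and the file survives automatic re-indentation.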

The resulting and desired output for this xslt was:

at -f alluser 23:58 17.4.2010
at -f ggroup67 7:58 28.4.2010
at -f ggroup70 15:58 18.4.2010
at -f alluser 23:58 18.4.2010
at -f ggroup61 7:58 22.9.2010
at -f ggroup60 23:58 21.9.2010
at -f alluser 3:58 22.9.2010

As I said, any suggestions of how to do this more elegantly would be appreciated.


FOLLOW-UP 2011-05-08:

Here's the type of xml I am treating:

<script xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="script.xsd">
    <command>
        <username>alluser</username>
        <startTime minutes="58" hours="23"/>
        <startDate>17.4.2010</startDate>
    </command>
</script>
Chris
  • You could save on the number of `<xsl:value-of>` elements by using `concat('at -f ', username, ' ', startTime/@hours, ' ', ...)`. Besides, you could wrap your source code – if you do that inside the tags, it won't affect the output. – Christopher Creutzig May 06 '11 at 08:50
  • Good question, +1. See my answer for a complete, very short and really generic solution. – Dimitre Novatchev May 06 '11 at 13:25
  • @Christopher Creutzig: Thanks for the great suggestion on concat(). What are you referring to with "wrap your source code"? – Chris May 08 '11 at 16:30
  • See Mads' answer: there's no need to put everything onto one big line. (Although I would not break the line *before* the comma. It just looks weird and does not add anything, not even being able to comment something out more easily.) – Christopher Creutzig May 09 '11 at 06:42
  • We don't do code reviews on Stack Overflow. I would suggest you reframe your question so it's an actual question (e.g. how do I strip the text out of *this* XML document), then post your draft effort as an answer. – Duncan Jones Jan 28 '18 at 06:32

2 Answers

  • You can define a template to match on script/command and eliminate the xsl:for-each
  • concat() can be used to shorten the expression and save you from explicitly inserting so many <xsl:text> and <xsl:value-of> elements.
  • Using the character reference &#xA; for the line feed, rather than relying on a literal line break preserved inside your <xsl:text> element, is a bit safer, since code formatting won't mess up your line breaks. Also, for me, it reads as an explicit line break and makes the intent easier to understand.

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" 
xmlns:fo="http://www.w3.org/1999/XSL/Format" >
    <xsl:output method="text" omit-xml-declaration="yes" indent="no"/>

    <xsl:template match="script/command">
        <xsl:value-of select="concat('at -f '
                    ,username
                    ,' '
                    ,startTime/@hours
                    ,':'
                    ,startTime/@minutes
                    ,' '
                    ,startDate
                    ,'&#xA;')"/>
    </xsl:template>

</xsl:stylesheet>
Mads Hansen
  • Thanks Mads, excellent suggestions. This is exactly what I was looking for. I had forgotten about the useful features of XPath 2... How is it that `&#xA;` gives me a new line on windows, when windows usually requires not only a line feed, but also a carriage return? – Chris May 08 '11 at 17:15
  • @Chris Dickinson Do note, this is an XSLT/XPath 1.0 solution, no XPath 2.0 features used. `&#xA;` (Line Feed) is often enough. You can add `&#xD;` (Carriage Return) if you need CRLF. – Mads Hansen May 08 '11 at 18:28

Just for fun: this can be done in a very general and compact way:

<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output omit-xml-declaration="yes" indent="yes"/>
    <xsl:output method="text"/>
    <xsl:strip-space elements="*"/>

    <xsl:template match="*">
        <xsl:apply-templates select="node()|@*"/>
        <xsl:text> </xsl:text>
    </xsl:template>

    <xsl:template match="username">
       at -f <xsl:apply-templates select="*|@*"/>
    </xsl:template>
</xsl:stylesheet>

when applied on this XML document:

<script>
 <command>
  <username>John</username>
  <startTime hours="09:" minutes="33"/>
  <startDate>05/05/2011</startDate>

  <username>Kate</username>
  <startTime hours="09:" minutes="33"/>
  <startDate>05/05/2011</startDate>

  <username>Peter</username>
  <startTime hours="09:" minutes="33"/>
  <startDate>05/05/2011</startDate>
 </command>
</script>

the wanted, correct result is produced:

   at -f 09:33 05/05/2011 
   at -f 09:33 05/05/2011 
   at -f 09:33 05/05/2011  

Note: This general approach works best if all the data to be output is contained in text nodes -- not in attributes.

Dimitre Novatchev
  • The @* values are missing(and were supposed to be delimited by ':'). Also, not sure whether the leading spaces before 'at -f' in the output would be a problem. – Mads Hansen May 06 '11 at 14:22
  • @Mads Hansen: Thanks for noting this. Fixed now. – Dimitre Novatchev May 06 '11 at 14:36
  • Almost, but I don't think the source XML has ':' in the value of `@hours`. The sample XSL posted is explicitly putting ':' in, not selecting from the attribute value. – Mads Hansen May 06 '11 at 15:07
  • @Mads Hansen: Sure. While I said "for fun", my answer points out a generic method of designing the XML so that the same general and trivial XSLT transformation can be used to generate the output, not needing to know any additional details. As I said in my answer, I wouldn't use attributes and would store the data only in text nodes. – Dimitre Novatchev May 06 '11 at 16:00
  • Tip for [LibXML2](http://xmlsoft.org/) users (PHP, Python, browsers, etc.): if you do not use `<xsl:strip-space elements="*"/>`, it does not strip spaces (!). – Peter Krauss Feb 25 '14 at 05:37