
I've been tasked with writing some XSLT 2.0 to translate one XML document into another. I'm relatively new to XSLT, but I have learned a lot in the days I've been doing this. During this time I have had to map simple values, e.g. 002 -> TH. This has been fine for small lists of fewer than 10 values, where I used xsl:choose. However, I now need to map over 300 values from one list to another and vice versa. Each list has a value and a textual description. The values in the two lists do not always map directly, so I may have to compare textual descriptions and fall back to default values if necessary.

I have two solutions to the problem:

  1. Use xsl:choose: I think this could be slow and possibly hard to update if either of the lists changes.

  2. Have an XML document that holds the relationship between each pair of list items. I would use an XPath expression to retrieve the associated value. This is my preferred solution because I believe it will be more maintainable and easier to update, although I'm not sure it is efficient.
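
For reference, option 1 has roughly the following shape. This is only a sketch: the `code` element name is hypothetical, the value pairs and the `??` fallback are illustrative, and the template is a fragment that would sit inside an xsl:stylesheet.

```xml
<!-- Sketch only: "code" is a hypothetical source element name -->
<xsl:template match="code">
  <code>
    <xsl:choose>
      <xsl:when test=". = '001'">RZ</xsl:when>
      <xsl:when test=". = '002'">TH</xsl:when>
      <xsl:when test=". = '003'">SC</xsl:when>
      <!-- ...one branch per value, 300+ in total... -->
      <xsl:otherwise>??</xsl:otherwise>
    </xsl:choose>
  </code>
</xsl:template>
```

Every change to either list means editing the stylesheet itself, which is the maintenance concern with this option.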

Which solution should I use: one of my suggestions, or is there a better way to map these values?

Mathias Müller
  • Could you provide a small example (smallest possible) with the two xml files and the rules to convert from a list to its image? – Dimitre Novatchev Jan 19 '09 at 20:15
  • This is pretty much impossible to answer if you don't provide an example. BTW: using xsl:choose most definitely is not the best way... even for smaller lists. – tcurdt Jan 20 '09 at 00:10
  • After reading your problem's description (before even reading your solutions) I thought that I'd tackle the problem via a mapping document, just as you describe in solution 2. – Urs Reupke Jan 19 '09 at 20:05

2 Answers


Here is an XSLT 2.0 solution.

Source XML file:

<input>
  <data>001</data>
  <data>002</data>
  <data>005</data>
</input>

"Mapping" XML file:

<map>
  <default>?-?-?</default>
  <input value="001">RZ</input>
  <input value="002">TH</input>
  <input value="003">SC</input>
</map>

XSLT transformation:

<xsl:stylesheet version="2.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes" indent="yes"/>

  <xsl:param name="pmapFile" 
       select="'C:/temp/deleteMap.xml'" />

  <xsl:variable name="vMap" 
       select="document($pmapFile)" />

  <xsl:variable name="vDefault" 
       select="$vMap/*/default/text()" />

  <xsl:key name="kInputByVal" match="input" 
   use="@value" />

  <xsl:template match="/*">
    <output>
      <xsl:apply-templates/>
    </output>
  </xsl:template>

  <xsl:template match="data">
    <data>
      <!-- First matching map entry, else the default value -->
      <xsl:sequence select="(key('kInputByVal', ., $vMap)[1]/text(), $vDefault)[1]"/>
    </data>
  </xsl:template>
</xsl:stylesheet>

Output:

<output>
  <data>RZ</data>
  <data>TH</data>
  <data>?-?-?</data>
</output>

Do note the following:

  1. The use of the document() function to access the "mapping" document, which is stored in a separate XML file.

  2. The use of <xsl:key/> and the three-argument form of the key() function, which is new in XSLT 2.0, to look up each corresponding output value. The third argument specifies the XML document (more precisely, the top node of the subtree) that must be searched and indexed.
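
If the lookup is needed in several templates, it could be wrapped in a stylesheet function. The following is a minimal sketch, not part of the solution above: the `my` prefix, its namespace URI and the function name `my:lookup` are hypothetical, and the fragment assumes the `kInputByVal` key and the `$vMap` and `$vDefault` variables defined in the transformation.

```xml
<!-- Assumes two extra declarations on xsl:stylesheet
     (the "my" prefix and its URI are hypothetical):
     xmlns:xs="http://www.w3.org/2001/XMLSchema"
     xmlns:my="urn:example:mapping" -->
<xsl:function name="my:lookup" as="xs:string">
  <xsl:param name="pValue" as="xs:string"/>
  <!-- First match from the mapping document, else the default -->
  <xsl:sequence
       select="string((key('kInputByVal', $pValue, $vMap), $vDefault)[1])"/>
</xsl:function>

<!-- Would replace the "data" template shown above -->
<xsl:template match="data">
  <data>
    <xsl:value-of select="my:lookup(string(.))"/>
  </data>
</xsl:template>
```

Because the third argument of key() names the document to search, the function does not depend on a context node, which a stylesheet function does not have.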

Dimitre Novatchev
  • Neat. :-) Is there a fundamental flaw in my approach or is all of it necessary to make it work in XSLT 1.0? My transformation seems so long in comparison. – Tomalak Jan 20 '09 at 19:37
  • @Tomalak: You could eliminate most of the default value processing by: However, for this to work (in XSLT 1) the "default" element must follow all "entry" elements in document order. I will answer this question if you ask it :) – Dimitre Novatchev Jan 20 '09 at 20:01
  • Re: "default" element must be last: I would suspect that in XSLT 1.0, document order takes precedence over "node set concatenation" order, so that "(node[1]|node[2])" and "(node[2]|node[1])" yield an identical node set. – Tomalak Jan 21 '09 at 07:55
  • @Tomalak: The rule that a node-set must be in document order is the same in XSLT 2, too. However, XPath 2 has a new datatype: the sequence, which by definition holds items in any specified order. So, if you compose a sequence by using the "," or "to" operator, you get your desired order. – Dimitre Novatchev Jan 21 '09 at 14:09
  • Re: Non-trivial XSLT. Yes, and if you post an XSLT answer to what they think is a C# only question, they even downvote you. Regardless that your answer solves the problem in a completely superior way to all C# solutions proposed. I am enjoying solving Project Euler now -- in XSLT of course. – Dimitre Novatchev Jan 21 '09 at 14:12
  • Have fun. :-) I have read your blog post on the C# finger tree this morning. Enjoyable read, though way over my head I'm afraid. Too bad only spam bots left comments so far. Where is the answer you got down-voted for? – Tomalak Jan 21 '09 at 14:58
  • OBTW, incorporating your suggestion into the XSLT 1.0 code worked out nicely, see below. What made me think: I used to believe that would create an index for later use. Now that I've learned that evaluation happens not before the call to key() - where does the speed benefit come from? – Tomalak Jan 21 '09 at 15:03
  • @Tomalak: An index is built the 1st time a key() function is evaluated for a key for a document. The speed gain comes if the key() function is used more than once. Certain XSLT processors may combine the initial parsing of a document with building all indexes. This speeds up even the 1st use of key() – Dimitre Novatchev Jan 21 '09 at 17:29
  • @Tomalak: Just try to implement a finger tree from Hinze/Patterson's article and the understanding of it will come gradually. It's fascinating, I agree. Just tell some people that solving certain problems feels better than sex and look at their concerned faces... :) – Dimitre Novatchev Jan 21 '09 at 17:32
  • @Tomalak: It was my answer to this question: http://stackoverflow.com/questions/451950/get-the-xpath-to-an-xelement The answer is currently deleted, but you have over 10K points and (I heard) can see deleted answers. – Dimitre Novatchev Jan 21 '09 at 17:36
  • I can. I can even undelete them, though I don't think that this is a particularly useful capability. Regarding the finger tree: You would be surprised how far I am from being able to write my own implementation. Even I am surprised at times. ;-) – Tomalak Jan 21 '09 at 18:35

Here is a way to do what you intend, using an <xsl:key> and otherwise following your method two.

The sample input file (data.xml):

<?xml version="1.0" encoding="utf-8"?>
<input>
  <data>001</data>
  <data>002</data>
  <data>005</data>
</input>

The sample map file (map.xml):

<?xml version="1.0" encoding="utf-8"?>
<map default="??">
  <entry key="001">RZ</entry>
  <entry key="002">TH</entry>
  <entry key="003">SC</entry>
</map>

The sample XSL stylesheet, explanation follows:

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" 
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" encoding="utf-8" indent="yes"/>

  <xsl:param name="map-file" select="string('map.xml')" />
  <xsl:variable name="map-doc" select="document($map-file)" />
  <xsl:variable name="default-value" select="$map-doc/map/@default" />
  <xsl:key name="map" match="/map/entry" use="@key" />

  <xsl:template match="/input">
    <output>
      <xsl:apply-templates select="data" />
    </output>
  </xsl:template>

  <xsl:template match="data">
    <xsl:variable name="raw-value" select="." />
    <xsl:variable name="mapped-value">
      <xsl:for-each select="$map-doc">
        <xsl:value-of select="key('map', $raw-value)" />
      </xsl:for-each>
    </xsl:variable>
    <data>
      <xsl:choose>
        <xsl:when test="$mapped-value = ''">
          <xsl:value-of select="$default-value" />
        </xsl:when>
        <xsl:otherwise>
          <xsl:value-of select="$mapped-value" />
        </xsl:otherwise>
      </xsl:choose>
    </data>
  </xsl:template>
</xsl:stylesheet>

What this does is:

  • use document() to open map.xml, saving the resulting node-set to a variable
  • save the default value for further reference
  • prepare an <xsl:key> to work against the "map" node set
  • use <xsl:for-each> not as a loop, but as a means to switch the execution context before calling the key() function - otherwise key() would work against the "data" document and return nothing
  • find the corresponding node with the key() function, save its string value in a variable
  • check the variable value on output - if it is empty, use the default value
  • repeat (through <xsl:apply-templates>)

The credit for the neat <xsl:for-each> trick goes to Jeni Tennison, who described the technique on the XSL mailing list. Be sure to read the thread.

Output of running the stylesheet against data.xml:

<?xml version="1.0" encoding="utf-8"?>
<output>
  <data>RZ</data>
  <data>TH</data>
  <data>??</data>
</output>

All of this is XSLT 1.0. I'm convinced a better/more elegant version exists that makes use of the advantages XSLT 2.0 offers, but unfortunately I'm not overly familiar with XSLT 2.0. Maybe someone else posts a better solution.


EDIT

Thanks to Dimitre Novatchev's hint in the comments, I was able to create a considerably shorter and preferable stylesheet:

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" 
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" encoding="utf-8" indent="yes"/>

  <xsl:param name="map-file" select="string('map.xml')" />
  <xsl:variable name="map-doc" select="document($map-file)" />
  <xsl:variable name="default" select="$map-doc/map/default[1]" />
  <xsl:key name="map" match="/map/entry" use="@key" />

  <xsl:template match="/input">
    <output>
      <xsl:apply-templates select="data" />
    </output>
  </xsl:template>

  <xsl:template match="data">
    <xsl:variable name="raw-value" select="." />
    <data>
      <xsl:for-each select="$map-doc">
        <xsl:value-of select="(key('map', $raw-value)|$default)[1]" />
      </xsl:for-each>
    </data>
  </xsl:template>
</xsl:stylesheet>

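However, this one requires a slightly different map file to work in XSLT 1.0. A union (|) always returns its nodes in document order, so [1] selects the default only when no entry matches, provided the default element comes after all entry elements: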

<?xml version="1.0" encoding="utf-8"?>
<map>
  <entry key="001">RZ</entry>
  <entry key="002">TH</entry>
  <entry key="003">SC</entry>
  <!-- default entry must be last in document -->
  <default>??</default>
</map>
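
The question also asks for the reverse direction. A second key over the same map file might cover that. This is an untested sketch under assumptions: the key name `map-reverse` and the `reverse` mode are hypothetical, and it reuses `$map-doc` and `$default` from the stylesheet above, although the reverse direction may well need its own default value.

```xml
<!-- Second key on the same map file: indexes entries by their text content -->
<xsl:key name="map-reverse" match="/map/entry" use="." />

<!-- Hypothetical template for input that carries the mapped codes (RZ, TH, ...) -->
<xsl:template match="data" mode="reverse">
  <xsl:variable name="raw-value" select="." />
  <data>
    <xsl:for-each select="$map-doc">
      <!-- @key of the matching entry precedes <default> in document order -->
      <xsl:value-of
           select="(key('map-reverse', $raw-value)/@key | $default)[1]" />
    </xsl:for-each>
  </data>
</xsl:template>
```

It would be invoked with <xsl:apply-templates select="data" mode="reverse" />. Keeping both directions in one map file means a changed pair is edited in only one place.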

Tomalak
  • Here is an XSLT 2.0 solution :) – Dimitre Novatchev Jan 20 '09 at 14:49
  • @Tomalak: You could eliminate most of the default value processing by: However, for this to work (in XSLT 1) the "default" element must follow all "entry" elements in document order. I will answer this question if you ask it :) – Dimitre Novatchev Jan 20 '09 at 20:02
  • @Tomalak: Apart from this possible refactoring, your solution is a very good one! – Dimitre Novatchev Jan 20 '09 at 20:03
  • Thanks for the hint. :) Factored in - now it looks quite similar to your solution. – Tomalak Jan 21 '09 at 08:08
  • @Tomalak: The rule that a node-set must be in document order is the same in XSLT 2, too. However, XPath 2 has a new datatype: the sequence, which by definition holds items in any specified order. So, if you compose a sequence by using the "," or "to" operator, you get your desired order – Dimitre Novatchev Jan 21 '09 at 14:57