
I'm learning to use XSLT to transform XML into HTML/XHTML.

The XSLT <xsl:for-each> element is a core element of the language that allows looping. However, posts here and elsewhere suggest that using it is common for beginners (which I am!) and is poor style.

My question is: what are better (as in more efficient / elegant / better style) alternatives to <xsl:for-each> loops, and why?

In the example below I used nested <xsl:for-each> and <xsl:choose> elements to loop through the required nodes with a conditional <xsl:when> test. This works okay and selects the nodes I need, but does feel rather clunky...

Your wisdom and insights would be greatly appreciated!

My example XML is a report generated by a Stanford HIVdb database query: https://hivdb.stanford.edu/hivdb/by-sequences/

XSD schema is here: https://hivdb.stanford.edu/DR/schema/sierra.xsd

My example XML report is here: https://github.com/delfair/xml_examples/blob/main/Resistance_1636677016671.xml

My example XSLT:

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:template match="/">

<html>
<head>
    <title>Example Report</title>
</head>
<body>

<h3>Significant mutations</h3>

<xsl:for-each select=".//geneData">
    <xsl:choose>
        <xsl:when test="gene='HIV1PR'">
        Protease inhibitor mutations<br/><br/>
        </xsl:when>
        <xsl:when test="gene='HIV1RT'">
        Reverse transcriptase inhibitor mutations<br/><br/>
        </xsl:when>
        <xsl:when test="gene='HIV1IN'">
        Integrase inhibitor mutations<br/><br/>
        </xsl:when>
    </xsl:choose>
<table>
<xsl:for-each select=".//mutation">
    <xsl:choose>
        <xsl:when test="classification='PI_MAJOR' or classification='PI_MINOR' or classification='NRTI' or classification='NNRTI' or classification='INI_MAJOR' or classification='INI_MINOR'">
        <tr>
        <td>Class</td>
        <td>Mutation</td>
        </tr>
        <tr>
            <td><xsl:value-of select="classification"/></td>
            <td><xsl:value-of select="mutationString"/></td>
        </tr>
        </xsl:when>
    </xsl:choose>
</xsl:for-each>
</table><br/>
</xsl:for-each>

</body>
</html>

</xsl:template>
</xsl:stylesheet>

Resulting HTML:

<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<title>Example Report</title>
</head>
<body>
<h3>Significant mutations</h3>
Protease inhibitor mutations<br><br><table></table>
<br>
Reverse transcriptase inhibitor mutations<br><br><table>
<tr>
<td>Class</td>
<td>Mutation</td>
</tr>
<tr>
<td>NNRTI</td>
<td>K103N</td>
</tr>
</table>
<br>
Integrase inhibitor mutations<br><br><table></table>
<br>
</body>
</html>

2 Answers

4

First of all, xsl:for-each is NOT used for "looping". It has no exit condition and there is no way to pass the result of one iteration to another.

Next, using xsl:for-each is NOT limited to beginners, nor is it "poor style" - despite what you might have read here or anywhere else.

The xsl:for-each instruction is no more than a shortcut used in a special case. The general approach works in two stages:

  • first, you select a set of nodes and tell the processor to apply templates to them;

  • next, the processor finds the template that best matches each node in the selected node-set and applies it.

In the case where (1) you want to apply uniform processing to all nodes in the selected set and (2) there is no need to apply the template recursively or re-use it otherwise, you can simply tell the processor: take these nodes and apply this template to them.

And that's exactly what the xsl:for-each instruction does. It has a select attribute to select the nodes to process and its content is a template to be applied to the selected node-set. Nothing more, nothing less. There is no paradigm shift here. There is no "push style" vs. "pull-style". There is no good vs. evil.
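To make the equivalence concrete, here is a minimal sketch that reuses the `mutation` and `mutationString` names from the question. The `<ul>`/`<li>` output is purely illustrative, and the stylesheet assumes no other rule matches `mutation`. Under those assumptions, both lists should come out identical:

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/">
    <html>
      <body>
        <!-- Shortcut form: select the nodes and supply the template inline -->
        <ul>
          <xsl:for-each select=".//mutation">
            <li><xsl:value-of select="mutationString"/></li>
          </xsl:for-each>
        </ul>

        <!-- General form: select the nodes; the processor finds the matching rule -->
        <ul>
          <xsl:apply-templates select=".//mutation"/>
        </ul>
      </body>
    </html>
  </xsl:template>

  <!-- The rule that the apply-templates call above ends up using -->
  <xsl:template match="mutation">
    <li><xsl:value-of select="mutationString"/></li>
  </xsl:template>

</xsl:stylesheet>
```

The second form costs one extra declaration, but the `mutation` rule can then be reused from other places or overridden, which the inline body of `xsl:for-each` cannot.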

The only problem with xsl:for-each arises when it is the only tool in the stylesheet author's toolbox. As I said, it is a shortcut that can be used in special circumstances. When those circumstances do not apply, using it leads to very poor code.

michael.hor257k
  • Ah the looping description is from https://www.w3schools.com/xml/xsl_for_each.asp so not mine! – Derek Fairley Nov 19 '21 at 14:00
  • Thanks for your response - improving my code is the objective – Derek Fairley Nov 19 '21 at 14:00
  • @DerekFairley `W3Schools` is just a tutorial site, not a reference. Despite their name, they are not affiliated in any way with the World Wide Web Consortium (W3C). The only normative references are the XSLT specifications, such as https://www.w3.org/TR/1999/REC-xslt-19991116. – michael.hor257k Nov 19 '21 at 15:30
1

I guess what you mean by "advanced style" is using templates (that do pattern matching) instead of xsl:for-each "loops".

The core functionality of your code could be transformed as follows:

...
      <h3>Significant mutations</h3>

      <xsl:apply-templates select=".//geneData" />
      <table>
        <xsl:apply-templates select=".//mutation" />
      </table><br/>
    </body>
  </html>
</xsl:template>
<!-- End of main template matching "/" -->

<xsl:template match="geneData[gene='HIV1PR']">Protease inhibitor mutations<br/><br/></xsl:template>
<xsl:template match="geneData[gene='HIV1RT']">Reverse transcriptase inhibitor mutations<br/><br/></xsl:template>
<xsl:template match="geneData[gene='HIV1IN']">Integrase inhibitor mutations<br/><br/></xsl:template>
<xsl:template match="geneData" />   <!-- Discard all not matched 'geneData' elements -->    

<xsl:template match="mutation[classification='PI_MAJOR' or classification='PI_MINOR' or classification='NRTI' or classification='NNRTI' or classification='INI_MAJOR' or classification='INI_MINOR']">
        <tr>
            <td>Class</td>
            <td>Mutation</td>
        </tr>
        <tr>
            <td><xsl:value-of select="classification"/></td>
            <td><xsl:value-of select="mutationString"/></td>
        </tr>
</xsl:template>
<xsl:template match="mutation" />   <!-- Discard all not matched 'mutation' elements -->    

</xsl:stylesheet>

In the above code, both node sets (.//geneData and .//mutation) are passed to xsl:apply-templates, which, for each node, finds the best-matching template and executes it. Hence the short xsl:template rules with predicates (the [...] parts of match="..."), which replace the xsl:when branches of your code.

This is supposedly the "standard" approach of XSLT development. In practice there are use-cases where xsl:for-each may be preferable for code clarity, but generally both approaches are interchangeable.
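If the per-gene grouping of the question's nested version matters, one possible variation is to let the `geneData` rule itself apply templates to its own descendants, so each table contains only that gene's mutations. The following is an untested sketch against the actual report. It assumes `gene` is a child of `geneData` (as the question's `test="gene='HIV1PR'"` implies), and it deliberately mirrors the question's output, including the header row repeated for each mutation:

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/">
    <html>
      <head>
        <title>Example Report</title>
      </head>
      <body>
        <h3>Significant mutations</h3>
        <xsl:apply-templates select=".//geneData"/>
      </body>
    </html>
  </xsl:template>

  <!-- One section per geneData: heading first, then a table of its own mutations -->
  <xsl:template match="geneData">
    <xsl:apply-templates select="gene"/>
    <table>
      <xsl:apply-templates select=".//mutation"/>
    </table>
    <br/>
  </xsl:template>

  <!-- Headings chosen by pattern matching instead of xsl:choose -->
  <xsl:template match="gene[.='HIV1PR']">Protease inhibitor mutations<br/><br/></xsl:template>
  <xsl:template match="gene[.='HIV1RT']">Reverse transcriptase inhibitor mutations<br/><br/></xsl:template>
  <xsl:template match="gene[.='HIV1IN']">Integrase inhibitor mutations<br/><br/></xsl:template>
  <!-- Any other gene value: no heading -->
  <xsl:template match="gene"/>

  <!-- Mutations in the classes of interest produce table rows -->
  <xsl:template match="mutation[classification='PI_MAJOR' or classification='PI_MINOR' or classification='NRTI' or classification='NNRTI' or classification='INI_MAJOR' or classification='INI_MINOR']">
    <tr>
      <td>Class</td>
      <td>Mutation</td>
    </tr>
    <tr>
      <td><xsl:value-of select="classification"/></td>
      <td><xsl:value-of select="mutationString"/></td>
    </tr>
  </xsl:template>
  <!-- All other mutations are discarded rather than left to the built-in rules -->
  <xsl:template match="mutation"/>

</xsl:stylesheet>
```

The rules with a predicate win over the bare `gene` and `mutation` rules because a pattern with a predicate has a higher default priority (0.5) than a plain element name (0).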

zx485
  • One of the biggest reasons for templates over for-each is extensibility. You can import a given stylesheet and define a new template with the same match criteria that will override the original template's behavior. Think of it as similar to extending a Java class and overriding an inherited method. Extracting the for-each logic into a template is similar to refactoring units of work into utility functions: the result is easier to reason about, and it is much easier to customize and override a smaller unit of work without copying and pasting the rest that is unchanged. – Mads Hansen Nov 19 '21 at 12:37
  • That's really useful - thank you. – Derek Fairley Nov 19 '21 at 14:01
  • @zx485 Your example is doing something I'm struggling to understand...! If there is a match for the conditional test it returns the expected values for the "classification" and "mutationString" elements. However when there is no match, it returns value of all 7 child elements of the "mutation" element. What am I missing? – Derek Fairley Nov 19 '21 at 16:22
  • Good point. I haven't tested the code. The problem is that all `<mutation>` elements are passed by `xsl:apply-templates`, but not all are matched. The ones not matched are processed by the [default templates](https://learn.microsoft.com/en-us/visualstudio/xml-tools/xslt-default-templates?view=vs-2022) of XSLT. The solution is simple: add an empty template for the super-set like this `<xsl:template match="mutation" />`. This will match and discard all `<mutation>` elements that are not matched by more specific template rules. – zx485 Nov 19 '21 at 16:39
  • Great - thanks so much. Learning a lot in a short time here... – Derek Fairley Nov 19 '21 at 16:44
  • @DerekFairley I did a little [write-up on `<xsl:apply-templates>`](https://stackoverflow.com/a/4478126/18771) once. – Tomalak Nov 20 '21 at 09:18
  • @Tomalak Thanks, very informative! Beginning to understand the power of using templates... You also explain there is no fixed processing order, which is something I'm still trying to resolve in this case. The nice template example from @zx485 above matches and returns the required elements, but I lose the ordered matching of the original example that used nested `<xsl:for-each>` ... let's call it repetition ;-) – Derek Fairley Nov 21 '21 at 11:26
  • I'm not sure if this is mandatory for XSLT processors, but from my experience `apply-templates` processes nodes in document order. In [this link](https://www.stylusstudio.com/docs/v2007/d_xslt73.html): "By default, the new list of source nodes is processed in document order. However, you can use the xsl:sort instruction to specify that the selected nodes are to be processed in a different order. See xsl:sort. ". Use `xsl:sort` to specify your custom order. – zx485 Nov 22 '21 at 10:58
  • @zx485 I'll experiment with `xsl:sort` - although on reflection my issue is really grouping. Now learning about Muenchian grouping and use of keys to implement the grouping I need with `xsl:apply-templates` rather than my original nested `xsl:for-each` elements... Thank you for your very useful advice :-D – Derek Fairley Nov 22 '21 at 14:14
  • Thank you for accepting. BTW you can achieve some useful sorting by concatenating the sort-strings like this, for example: ``. This example, although probably not useful, combines the first 5 chars of the `gene` element value with the name of the second child element of `geneData` as sorting key. Maybe this example inspires you in some way. – zx485 Nov 22 '21 at 20:07
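A concatenated sort key of the kind described in the last comment might look like the sketch below. It is reconstructed from that description only and is not the commenter's original snippet. The text output method and the one-line-per-gene output are assumptions made to keep the example short:

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <xsl:apply-templates select=".//geneData">
      <!-- Sort key: first 5 characters of gene + name of the second child element -->
      <xsl:sort select="concat(substring(gene, 1, 5), name(*[2]))"/>
    </xsl:apply-templates>
  </xsl:template>

  <!-- Print each gene name on its own line, in the sorted order -->
  <xsl:template match="geneData">
    <xsl:value-of select="gene"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

Without the `xsl:sort` child, `xsl:apply-templates` processes the selected nodes in document order.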