I am an XSLT beginner, learning by example and by working on projects. Currently, I am working on creating a grouped, nested structure from a flat one.

Consider this sample XML input:

<root>
    <a>First text</a>
    <b>Text</b>
    <c>More text in c tag</c>
    <d>There is even d tag</d>
    <a>Another "a" test.</a>
    <b>ěščřžýáíéúů</b>
    <b>More b tags</b>
    <c>One followed by c tag</c>
    <a>Last a tag</a>
    <b>This time only with b tag, but this goes on and on</b>
</root>

And this XSLT:

<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes" indent="yes" method="xml" encoding="utf-8"/>
  <xsl:strip-space elements="*"/>

  <xsl:output method="xml" encoding="utf-8"/>

  <xsl:key name="groupA" match="b|c|d" use="generate-id(preceding-sibling::a[1])" />
  <xsl:key name="groupB" match="c|d" use="generate-id(preceding-sibling::b[1])"/>

  <xsl:template match="node() | @*">
    <xsl:copy>
      <xsl:apply-templates select="node() | @*"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="root">
    <wrapperTest>
      <xsl:apply-templates/>
    </wrapperTest>
  </xsl:template>

  <xsl:template match="root">
    <xsl:apply-templates select="@*|a"/>
    <xsl:apply-templates select="@*|b"/>
  </xsl:template>

  <xsl:template match="a">
    <xsl:copy>
      <xsl:apply-templates select="key('groupA', generate-id())" />
    </xsl:copy>
  </xsl:template>
  
  <xsl:template match="b">
    <xsl:copy>
      <xsl:apply-templates select="key('groupB', generate-id())" />
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>

The expected output is:

<wrapperTest>
    <a>First text
      <b>Text
        <c>More text in c tag</c>
        <d>There is even d tag</d>
      </b>
    </a>
    <a>Another "a" test.
      <b>ěščřžýáíéúů</b>
      <b>More b tags
        <c>One followed by c tag</c>
      </b>
    </a>
    <a>Last a tag
      <b>This time only with b tag, but this goes on and on</b>
    </a>
</wrapperTest>

The transformation I created produces excessive copies, and I have no idea why. I guess the issue I am hitting is basic in nature, but I can't figure it out.

The only constraint is that the solution should preferably be in XSLT 1.0 (since the project is incorporated in a Python script with lxml). In the edge case where this can't be achieved with XSLT 1.0, I can accommodate a recent Saxon version, which removes any limitations ...

I have already looked at answers here, here and others, but most of them either use XSLT 2.0 or are too complicated for a beginner to work through.

Final note: ideally, the proposed solution should be extensible in nature, because the final form of my project should also be grouped by the <c> tag, like so:

<wrapperTest>
    <a>First text
      <b>Text
        <c>More text in c tag
          <d>There is even d tag</d>
        </c>
      </b>
    </a>
    <a>Another "a" test.
      <b>ěščřžýáíéúů</b>
      <b>More b tags
        <c>One followed by c tag</c>
      </b>
    </a>
    <a>Last a tag
      <b>This time only with b tag, but this goes on and on</b>
    </a>
</wrapperTest>

Which I will happily do as a learning exercise.

Tomáš Kruliš
  • It seems like an advanced text book task for learning XSLT 2/3's `for-each-group group-starting-with` in a recursive function or template. As you say you have the option to use Saxon-C with Python bindings I would opt for that approach instead of fighting with keys or sibling recursion in XSLT 1. As for the current code, having two `xsl:template match="root"` doesn't make sense. – Martin Honnen Sep 03 '20 at 10:27
  • @MartinHonnen Thank you very much for your comment and answer; however, I would still prefer to know, or at least see, a solution in XSLT 1.0. The reason is that I am working on several devices and only on one of them do I have access to `saxon` without user control (my own personal device). Sadly, I don't know how to use the Python bindings for `saxon`; I actually have the `java` platform HE, which I run from the terminal (when working on a device that has access to it). I should've mentioned that I am a beginner programmer too ... – Tomáš Kruliš Sep 03 '20 at 11:46

2 Answers

How about:

XSLT 1.0

<xsl:stylesheet version="1.0" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>

<xsl:key name="b" match="b" use="generate-id(preceding-sibling::a[1])" />
<xsl:key name="cd" match="c|d" use="generate-id(preceding-sibling::b[1])"/>

<!-- identity transform -->
<xsl:template match="@*|node()">
    <xsl:copy>
        <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
</xsl:template>

<xsl:template match="/root">
    <wrapperTest>
        <xsl:apply-templates select="a"/>
    </wrapperTest>
</xsl:template>

<xsl:template match="a">
    <xsl:copy>
        <xsl:apply-templates/>
        <xsl:apply-templates select="key('b', generate-id())" />
    </xsl:copy>
</xsl:template>
  
<xsl:template match="b">
    <xsl:copy>
        <xsl:apply-templates/>
        <xsl:apply-templates select="key('cd', generate-id())" />
    </xsl:copy>
</xsl:template>

</xsl:stylesheet>
michael.hor257k
  • Thank you very much for your answer! Would you be so kind as to elaborate why `<xsl:apply-templates select="a"/>` and then cascading `<xsl:apply-templates/>` in specific node matches works, instead of the approach I was using (defining template applications in the match of the `root` node, at least I thought I was doing that ...)? It would be greatly appreciated and hopefully informative for anybody wrestling with a classic XSLT 2.0 issue using XSLT 1.0 ... – Tomáš Kruliš Sep 03 '20 at 11:53
  • You want `a` nodes - and only `a` nodes - to be children of the root `wrapperTest` element - therefore you only apply templates to them at that point. Similarly, when you create an `a` node, you only want the trailing `b` nodes to be its children. --- Note that `<xsl:apply-templates/>` doesn't do anything in this example except copy text nodes. – michael.hor257k Sep 03 '20 at 12:01
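
The same reasoning can be carried one level deeper for the question's final note (grouping `d` under `c`). The following is only a sketch, not part of the original answer: it assumes a strict a > b > c > d hierarchy in the input, and the key names `b`, `c` and `d` are chosen here for illustration. Each level gets its own key, indexed by the nearest preceding sibling of the parent level, and each template applies only the key for its direct children.

```xml
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" encoding="UTF-8" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- One key per level (names are illustrative): every element is indexed
       by the id of the nearest preceding sibling of its parent level. -->
  <xsl:key name="b" match="b" use="generate-id(preceding-sibling::a[1])"/>
  <xsl:key name="c" match="c" use="generate-id(preceding-sibling::b[1])"/>
  <xsl:key name="d" match="d" use="generate-id(preceding-sibling::c[1])"/>

  <!-- identity transform: copies text nodes, attributes and the leaf d elements -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- only a elements become children of the wrapper -->
  <xsl:template match="/root">
    <wrapperTest>
      <xsl:apply-templates select="a"/>
    </wrapperTest>
  </xsl:template>

  <!-- each level copies its own content, then pulls in only its direct children -->
  <xsl:template match="a">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
      <xsl:apply-templates select="key('b', generate-id())"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="b">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
      <xsl:apply-templates select="key('c', generate-id())"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="c">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
      <xsl:apply-templates select="key('d', generate-id())"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

One caveat of this key-based approach: a `d` that follows a `b` directly, with no `c` in between, would be attached to the nearest earlier `c`, even one belonging to a previous group. If the real data can skip levels like that, the `use` expressions would need an extra check, or the XSLT 2/3 grouping approach may be the simpler route.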

The recursive XSLT 2/3 `for-each-group group-starting-with` approach for all levels comes down to:

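<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:mf="http://example.com/mf"
    exclude-result-prefixes="mf">

<!-- the mf namespace URI above is an assumed placeholder; any URI works as long as the prefix is bound -->
<!-- start a new group at every element named like the first input item,
     then recurse on the remaining items of each group -->
<xsl:function name="mf:wrap" as="node()*">
    <xsl:param name="input" as="node()*"/>
    <xsl:for-each-group select="$input" group-starting-with="node()[node-name() = node-name($input[1])]">
        <xsl:copy>
            <xsl:sequence select="node(), mf:wrap(tail(current-group()))"/>
        </xsl:copy>
    </xsl:for-each-group>
</xsl:function>
        
<xsl:template match="root">
    <xsl:copy>
        <xsl:sequence select="mf:wrap(*)"/>
    </xsl:copy>
</xsl:template>

</xsl:stylesheet>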

https://xsltfiddle.liberty-development.net/gVhEaj8

Martin Honnen