
My (simplified) input XML looks like this:

<?xml version="1.0" encoding="utf-8"?>
<root>
  <recordList>

    <record>
      <id>16</id>

      <MaterialGroup>
        <material>
          <term>metal, glass</term>
        </material>
        <material.notes />
        <material.part>body</material.part>
      </MaterialGroup>

      <MaterialGroup>
        <material>
          <term>wood</term>
        </material>
        <material.notes>fragile</material.notes>
        <material.part>lid</material.part>
      </MaterialGroup>
    </record>

    <record>
      ...
    </record>

  </recordList>
</root>

Note that term may contain a comma-separated list of multiple materials (metal, glass).

Desired output:

I want to split material/term on the commas and duplicate its grandparent MaterialGroup, with all attributes and child nodes, once per token.

<?xml version="1.0" encoding="utf-8"?>
...
      <MaterialGroup>
        <material>
          <term>metal</term>
        </material>
        <material.notes />
        <material.part>body</material.part>
      </MaterialGroup>

      <MaterialGroup>
        <material>
          <term>glass</term>
        </material>
        <material.notes />
        <material.part>body</material.part>
      </MaterialGroup>

      <MaterialGroup>
        <material>
          <term>wood</term>
        </material>
        <material.notes>fragile</material.notes>
        <material.part>lid</material.part>
      </MaterialGroup>
    </record>
...

The first MaterialGroup is copied once for every token in the comma-delimited grandchild element material/term, and the term text of each copy is set to that token. material.part and material.notes can be copied unchanged.

My stylesheet:

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:msxsl="urn:schemas-microsoft-com:xslt" exclude-result-prefixes="msxsl"
>
  <xsl:output method="xml" indent="yes"/>
  <xsl:variable name="separator" select="','"/>

  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>


  <xsl:template match="material/term" mode="s">
    <xsl:param name="split_term"/>
    <xsl:value-of select="$split_term"/>
  </xsl:template>


  <xsl:template match="MaterialGroup" name="tokenize">
    <xsl:param name="text" select="material/term"/>

    <xsl:choose>
      <xsl:when test="not(contains($text, $separator))">
        <xsl:copy>
          <xsl:apply-templates/>
          <xsl:apply-templates select="material/term" mode="s">
            <xsl:with-param name="split_term">
              <xsl:value-of select="normalize-space($text)"/>
            </xsl:with-param>
          </xsl:apply-templates>

        </xsl:copy>

      </xsl:when>

      <xsl:otherwise>
        <xsl:copy>

          <xsl:apply-templates/>
          <xsl:apply-templates select="material/term" mode="s">
            <xsl:with-param name="split_term">
              <xsl:value-of select="normalize-space(substring-before($text, $separator))"/>
            </xsl:with-param>
          </xsl:apply-templates>

        </xsl:copy>

        <xsl:call-template name="tokenize">
          <xsl:with-param name="text" select="substring-after($text, $separator)"/>
        </xsl:call-template>
      </xsl:otherwise>
    </xsl:choose>

  </xsl:template>

</xsl:stylesheet>

Actual output:

<?xml version="1.0" encoding="utf-8"?>
<root>
  <recordList>

    <record>
      <id>16</id>

      <MaterialGroup>
        <material>
          <term>metal, glass</term>
        </material>
        <material.notes />
        <material.part>body</material.part>
        metal
      </MaterialGroup>

      <MaterialGroup>
        <material>
          <term>metal, glass</term>
        </material>
        <material.notes />
        <material.part>body</material.part>
        glass
      </MaterialGroup>

      <MaterialGroup>
        <material>
          <term>wood</term>
        </material>
        <material.notes>fragile</material.notes>
        <material.part>lid</material.part>
        wood
      </MaterialGroup>
    </record>

    <record>
      ...
    </record>
  </recordList>
</root>

The tokens (metal, glass) appear as text nodes that are direct children of MaterialGroup, after material.part. The element where each token should actually appear (material/term) is unchanged.

I looked at a couple of solutions to similar problems, but without success:

https://stackoverflow.com/a/5480198/2044940
https://stackoverflow.com/a/10430719/2044940
http://codesequoia.wordpress.com/2012/02/15/xslt-example-add-a-new-node-to-elements/
...

Any ideas?


Edit: Solution by Martin, without modes as suggested by michael:

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:msxsl="urn:schemas-microsoft-com:xslt" exclude-result-prefixes="msxsl"
>
  <xsl:output method="xml" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <xsl:param name="separator" select="', '"/>

  <xsl:template match="@* | node()">
    <xsl:param name="term"/>
    <xsl:copy>
      <xsl:apply-templates select="@* | node()">
        <xsl:with-param name="term" select="$term"/>
      </xsl:apply-templates>
    </xsl:copy>
  </xsl:template> 


  <xsl:template match="material/term">
    <xsl:param name="term"/>
    <xsl:copy>
      <xsl:value-of select="$term"/>
    </xsl:copy>
  </xsl:template>


  <xsl:template match="MaterialGroup" name="tokenize">
    <xsl:param name="text" select="material/term"/>

    <xsl:choose>
      <xsl:when test="not(contains($text, $separator))">
        <xsl:copy>
          <xsl:apply-templates>
            <xsl:with-param name="term" select="$text"/>
          </xsl:apply-templates>
        </xsl:copy> 
      </xsl:when>

      <xsl:otherwise> 
        <xsl:copy>
          <xsl:apply-templates>
            <xsl:with-param name="term" select="substring-before($text, $separator)"/>
          </xsl:apply-templates>
        </xsl:copy> 

        <xsl:call-template name="tokenize">
          <xsl:with-param name="text" select="substring-after($text, $separator)"/>
        </xsl:call-template>
      </xsl:otherwise>
    </xsl:choose>

  </xsl:template>

</xsl:stylesheet>
CodeManX

1 Answer


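I think you need to pass the term down as a parameter through the templates that copy the MaterialGroup subtree, so that it reaches material/term: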

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:msxsl="urn:schemas-microsoft-com:xslt" exclude-result-prefixes="msxsl"
>
  <xsl:output method="xml" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <xsl:param name="separator" select="', '"/>

  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="@* | node()" mode="s">
    <xsl:param name="term"/>
    <xsl:copy>
      <xsl:apply-templates select="@* | node()" mode="s">
        <xsl:with-param name="term" select="$term"/>
      </xsl:apply-templates>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="material/term" mode="s">
    <xsl:param name="term"/>
    <xsl:copy>
      <xsl:value-of select="$term"/>
    </xsl:copy>
  </xsl:template>


  <xsl:template match="MaterialGroup" name="tokenize">
    <xsl:param name="text" select="material/term"/>

    <xsl:choose>
      <xsl:when test="not(contains($text, $separator))">
        <xsl:copy>
          <xsl:apply-templates mode="s">
            <xsl:with-param name="term" select="$text"/>
          </xsl:apply-templates>
        </xsl:copy>

      </xsl:when>

      <xsl:otherwise>

        <xsl:copy>
          <xsl:apply-templates mode="s">
            <xsl:with-param name="term" select="substring-before($text, $separator)"/>
          </xsl:apply-templates>
        </xsl:copy>


        <xsl:call-template name="tokenize">
          <xsl:with-param name="text" select="substring-after($text, $separator)"/>
        </xsl:call-template>
      </xsl:otherwise>
    </xsl:choose>

  </xsl:template>

</xsl:stylesheet>

That way, with your input, I get

<root>
   <recordList>
      <record>
         <id>16</id>
         <MaterialGroup>
            <material>
               <term>metal</term>
            </material>
            <material.notes/>
            <material.part>body</material.part>
         </MaterialGroup>
         <MaterialGroup>
            <material>
               <term>glass</term>
            </material>
            <material.notes/>
            <material.part>body</material.part>
         </MaterialGroup>
         <MaterialGroup>
            <material>
               <term>wood</term>
            </material>
            <material.notes>fragile</material.notes>
            <material.part>lid</material.part>
         </MaterialGroup>
      </record>
      <record>
      ...
    </record>
   </recordList>
</root>
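
As an aside, and only as a sketch: if an XSLT 2.0 processor is available (an assumption, since the stylesheets above are XSLT 1.0), the recursion and the explicit parameter forwarding can probably be replaced by tokenize() and a tunnel parameter. The regular expression used as the separator and the use of normalize-space() are my choices for illustration, not something taken from the code above.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- Sketch only: requires an XSLT 2.0 (or later) processor. -->
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Identity template: tunnel parameters pass through it automatically. -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Write the tunnelled token; fall back to the original text
       if no token was supplied. -->
  <xsl:template match="material/term">
    <xsl:param name="term" tunnel="yes" select="string(.)"/>
    <xsl:copy>
      <xsl:value-of select="$term"/>
    </xsl:copy>
  </xsl:template>

  <!-- Emit one copy of the group per comma-separated token.
       Assumes at most one material/term per group; a group whose term
       is missing or empty yields no tokens and is therefore dropped. -->
  <xsl:template match="MaterialGroup">
    <xsl:variable name="group" select="."/>
    <xsl:for-each select="tokenize(normalize-space(material/term), '\s*,\s*')">
      <xsl:element name="{name($group)}">
        <xsl:apply-templates select="$group/@* | $group/node()">
          <xsl:with-param name="term" select="." tunnel="yes"/>
        </xsl:apply-templates>
      </xsl:element>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

Unlike the fixed `', '` separator, the regular expression also splits `metal,glass` (no space after the comma). Whether dropping groups with an empty term is acceptable depends on the data.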
Martin Honnen
  • Awesome! Your solution with identity rule + mode works perfect. Thanks for this quick answer! I'll do a debug-run to see the resolution order step by step. – CodeManX Jan 06 '14 at 18:08
  • +1, nice idea. Just curious: couldn't you add a parameter to the default identity template and do away with the mode? – michael.hor257k Jan 07 '14 at 01:28
  • @michael.hor257k, yes, you are right about the mode, I started with the posted code using a mode and changed into a working solution but looking at the complete code now, you are right that the mode is not needed if we put the parameter into the identity transformation template of the default mode/no mode. – Martin Honnen Jan 07 '14 at 09:29
  • Cool, that is even better! I edited my question to include the final solution with the new mode-less code as suggested by michael. – CodeManX Jan 07 '14 at 11:47
  • @CoDEmanX It is cool indeed. I would name the parameter of the identity template "passthru" or similar, since it is a **general** device to pass data to downstream elements. This technique could be utilized in other ways, for example: assemble the path to each element by having each element append its name to the "passthru" parameter. – michael.hor257k Jan 07 '14 at 12:27
  • Good thought, and I actually renamed `term` locally to the more generic `token`. `passthru` emphasizes what actually happens with the parameter across multiple templates. Would like to keep `term` here however, so it's consistent in all code snippets. – CodeManX Jan 07 '14 at 15:09
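
The path idea from the last comments can be sketched in XSLT 1.0 as follows. This is a minimal illustration, not code from the thread: the parameter name `passthru` follows michael's suggestion, while the `path` attribute is a hypothetical way of making the assembled value visible in the output.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Elements: append this element's name to the inherited path,
       record it in a (hypothetical) "path" attribute, and pass it on. -->
  <xsl:template match="*">
    <xsl:param name="passthru"/>
    <xsl:variable name="path" select="concat($passthru, '/', name())"/>
    <xsl:copy>
      <xsl:attribute name="path">
        <xsl:value-of select="$path"/>
      </xsl:attribute>
      <xsl:apply-templates select="@* | node()">
        <xsl:with-param name="passthru" select="$path"/>
      </xsl:apply-templates>
    </xsl:copy>
  </xsl:template>

  <!-- Attributes, text, comments and processing instructions
       are copied unchanged. -->
  <xsl:template match="@* | text() | comment() | processing-instruction()">
    <xsl:copy/>
  </xsl:template>

</xsl:stylesheet>
```

Applied to the input from the question, each element should carry its own path, for example `<term path="/root/recordList/record/MaterialGroup/material/term">metal, glass</term>`. The same forwarding mechanism is what carries `term` down to material/term in the stylesheets above.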