I have an XML document of the form:

<Set>
   <Element name="Superset1_Set1_Element1"/>
   <Element name="Superset1_Set1_Element2"/>
   <Element name="Superset1_Set2_Element1"/>
   <Element name="Superset2_Set1_Element1"/>
   <Element name="Superset2_Set2_Element1"/>
</Set>

I want to transform it into XML of the form:

<Superset name="Superset1">
   <Set name="Set1">
       <Element name="Element1"/>
       <Element name="Element2"/>
   </Set>
   <Set name="Set2">
       <Element name="Element1"/>
   </Set>
</Superset>
<Superset name="Superset2">
   <Set name="Set1">
       <Element name="Element1"/>
   </Set>
   <Set name="Set2">
       <Element name="Element1"/>
   </Set>
</Superset>

How can this be done with XSLT?

Thanks a lot in advance!

Yaneeve

2 Answers

This can be solved with the following XSLT 1.0 transformation:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes" />

  <!-- this key selects elements by their "Superset" name -->
  <xsl:key name="kElementBySuperset" match="Element" use="
    substring-before(@name, '_')" 
  />

  <!-- this key selects elements by their "Superset_Set" name -->
  <xsl:key name="kElementBySet" match="Element" use="
    concat(
      substring-before(@name, '_'), 
      '_',
      substring-before(substring-after(@name, '_'), '_')
    )
  " />

  <!-- initialize output (note the template modes) -->
  <xsl:template match="Set">
    <xsl:apply-templates select="Element" mode="Superset">
      <xsl:sort select="@name" />
    </xsl:apply-templates>
  </xsl:template>

  <!-- output <Superset> elements, grouped by name -->
  <xsl:template match="Element" mode="Superset">
    <xsl:variable name="vSupersetName" select="
      substring-before(@name, '_')
    " />

    <xsl:if test="
      generate-id() 
      = 
      generate-id(key('kElementBySuperset', $vSupersetName)[1])
    ">
      <Superset name="{$vSupersetName}">
        <xsl:apply-templates 
          select="key('kElementBySuperset', $vSupersetName)" 
          mode="Set"
        >
          <xsl:sort select="@name" />
        </xsl:apply-templates>
      </Superset>
    </xsl:if>
  </xsl:template>

  <!-- output <Set> elements, grouped by name -->
  <xsl:template match="Element" mode="Set">
    <xsl:variable name="vSetName" select="
      concat(
        substring-before(@name, '_'), 
        '_',
        substring-before(substring-after(@name, '_'), '_')
      )"
    />

    <xsl:if test="
      generate-id() 
      = 
      generate-id(key('kElementBySet', $vSetName)[1])
    ">
      <Set name="{substring-after($vSetName, '_')}">
        <xsl:apply-templates 
          select="key('kElementBySet', $vSetName)" 
          mode="Element"
        >
          <xsl:sort select="@name" />
        </xsl:apply-templates>
      </Set>
    </xsl:if>
  </xsl:template>

  <!-- output <Element> elements -->
  <xsl:template match="Element" mode="Element">
    <xsl:variable name="vElementName" select="
      substring-after(
        substring-after(@name, '_'), 
        '_'
      )
    " />

    <Element name="{$vElementName}" />
  </xsl:template>

</xsl:stylesheet>

Output on my system when applied to your input document:

<Superset name="Superset1">
  <Set name="Set1">
    <Element name="Element1" />
    <Element name="Element2" />
  </Set>
  <Set name="Set2">
    <Element name="Element1" />
  </Set>
</Superset>
<Superset name="Superset2">
  <Set name="Set1">
    <Element name="Element1" />
  </Set>
  <Set name="Set2">
    <Element name="Element1" />
  </Set>
</Superset>

It is worth noting that this solution is case-sensitive. I assume that is desirable (or at least not harmful) in your case. If case-insensitivity is required, then sprinkling a handful of these would become necessary (where "…" must of course be replaced by the missing letters):

translate($anyvalue, 'ABC…XYZ', 'abc…xyz')

I avoided that because it is very repetitive and makes the solution (even more) obscure.
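
For illustration, here is roughly what that would involve for the Superset level alone. This is only a sketch: the key name kElementBySupersetCI and the variable vSupersetKey are made up for this example, and the alphabets cover plain ASCII letters only. XSLT 1.0 does not allow variable references in the use attribute of <xsl:key>, so the alphabets have to be spelled out literally there.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes" />

  <!-- hypothetical case-insensitive variant of kElementBySuperset;
       the alphabets are literal because <xsl:key> may not reference variables -->
  <xsl:key name="kElementBySupersetCI" match="Element" use="
    translate(substring-before(@name, '_'),
      'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz')
  " />

  <xsl:template match="Set">
    <xsl:apply-templates select="Element" mode="Superset" />
  </xsl:template>

  <xsl:template match="Element" mode="Superset">
    <!-- the lookup value must be lower-cased exactly like the key value -->
    <xsl:variable name="vSupersetKey" select="
      translate(substring-before(@name, '_'),
        'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz')
    " />

    <xsl:if test="generate-id() = generate-id(key('kElementBySupersetCI', $vSupersetKey)[1])">
      <!-- the output keeps the spelling of the first element in each group -->
      <Superset name="{substring-before(@name, '_')}" />
    </xsl:if>
  </xsl:template>

</xsl:stylesheet>
```

The same translate() call would have to be repeated in the second key and in every lookup against it, which is where the repetition comes from.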

Further reading: One of my solutions that does a similar two-step grouping using two <xsl:key>s is here:

XSLT 3-level grouping on attributes

It is a bit more verbose on the internals, and it contains a lengthy explanation of <xsl:key> that I'd like to avoid repeating here. ;-)
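
For comparison only, and assuming an XSLT 2.0 processor is available (the solution above targets XSLT 1.0), the same two-level grouping can be sketched with <xsl:for-each-group> instead of keys. This swaps in a different technique from the one used above; it is not a drop-in replacement for a 1.0 processor.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes" />

  <xsl:template match="Set">
    <!-- outer groups: everything before the first underscore -->
    <xsl:for-each-group select="Element" group-by="substring-before(@name, '_')">
      <Superset name="{current-grouping-key()}">
        <!-- inner groups: the part between the first and second underscore -->
        <xsl:for-each-group select="current-group()"
            group-by="substring-before(substring-after(@name, '_'), '_')">
          <Set name="{current-grouping-key()}">
            <xsl:for-each select="current-group()">
              <!-- element name: everything after the second underscore -->
              <Element name="{substring-after(substring-after(@name, '_'), '_')}" />
            </xsl:for-each>
          </Set>
        </xsl:for-each-group>
      </Superset>
    </xsl:for-each-group>
  </xsl:template>

</xsl:stylesheet>
```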

Tomalak
  • Side note: Though this *looks* a lot more complicated than @annakata's solution, it is in fact absolutely the same thing. It just uses separate templates in place of nested calls. And a lot of line breaks that are not strictly necessary. ;-) – Tomalak Jul 01 '09 at 09:49
  • There is a little compile error in this. change 'kElementBySuperset' to 'kElementBySup' and then this will work fine. – Aamir Jul 01 '09 at 09:51
  • Yes, just seen and fixed. This is what you get when you edit the code here *after* you pasted the tested version from your text editor... – Tomalak Jul 01 '09 at 09:54
  • +1 see, I knew it could be done with templates and I did say I was tired :D – annakata Jul 01 '09 at 09:57
  • Hey - I don't think your solution should be deleted! It is a lot shorter than this version, which is clearly one of the advantages of <xsl:for-each> over separate templates. – Tomalak Jul 01 '09 at 10:00
  • @annakata... you shouldn't have deleted your reply. That was another good thing to learn for us XSLT newbies. @Tomalak: +1 for an excellent solution – Aamir Jul 01 '09 at 10:01
  • Happy now? :P (the "//" alone in the for-each solution scares me, and brevity is never a great reason for anything) – annakata Jul 01 '09 at 10:01
  • The "//" is easily fixed - just call key() with the current name part. – Tomalak Jul 01 '09 at 10:03
  • Thanks guys for your in-depth solution proposition. Since my case is much more complicated than the example given I have not yet decided which of your versions I should use. Time will tell I suppose... – Yaneeve Jul 01 '09 at 10:47
  • How much more complicated is it? And why didn't you come up with the real case from the start? :-) – Tomalak Jul 01 '09 at 11:16
  • Ultra complicated... This is just the first problem of many. I didn't come up with the real case for two reasons. The first was because I wanted to have this as a reference for people with similar problems. The second because my other problems are of a different nature altogether and I didn't want to mix and match... – Yaneeve Jul 01 '09 at 11:36
  • So *this part* of your compound problem has been solved? You sounded a bit like it is also more complicated than you indicated. – Tomalak Jul 01 '09 at 11:52
  • This part has not been solved yet, though I've got some ideas, due to the fact that unlike my example where superset, set and element do not have underscores within them, my real supersets, sets and elements do. For example Superset1 could be homeward_bound whereas a set could be type_of_home and element could be loft. In this case my current element name would be "homeward_bound_type_of_home_loft". What would you suggest? Currently I think I will replace my concrete superset and set names with virtual Superset1,2,etc... Set1,2,etc.. and then run your algorithm. What do you say? – Yaneeve Jul 02 '09 at 13:01
  • This sounds like you're in knee deep sh*t here. ;-) Whoever is responsible for that XML should be hurt a lot. If you have nothing to split on that would work with substring-before() etc., then you have nothing to build an <xsl:key> on. Which means you must take a two-step approach, transforming your XML to something more digestible first (using recursion and a list of known, fixed set names), and then you can apply grouping to it. The grouping part should then also get a bit easier, since you can cut out all the substring calculations. – Tomalak Jul 02 '09 at 13:45
  • It's actually waist high ;-) Can I take the two-step approach within the same xslt or must I create two xslts and run the second one on the output of the first? (As you can see I am kind of an XSLT newbie) – Yaneeve Jul 02 '09 at 15:17
  • Question continues in: http://stackoverflow.com/questions/1075400/how-to-create-subsets-of-a-single-set-of-elements-where-element-names-are-comple – Yaneeve Jul 02 '09 at 16:47

Preserved for reference, but I strongly suggest the use of templates (i.e. Tomalak's solution) where possible for readability alone...


Certainly possible, but actually harder than I anticipated because of the second-order sets and the two underscores in each name - the following could certainly be improved if the "name" values were of a friendlier format.

<xsl:key name="supers" match="Set/Element" use="substring-before(@name,'_')"/>
<xsl:key name="sets" match="Set/Element" use="concat(substring-before(@name,'_'),'_',substring-before(substring-after(@name,'_'),'_'))"/>

<xsl:template match="/">
    <!-- first Element of each Superset group -->
    <xsl:for-each select="Set/Element[generate-id() = generate-id(key('supers',substring-before(@name,'_'))[1])]">
        <xsl:variable name="super" select="substring-before(@name,'_')"/>
        <Superset name="{$super}">
            <!-- first Element of each Set group within this Superset -->
            <xsl:for-each select="//Set/Element[generate-id() = generate-id(key('sets',concat($super,'_',substring-before(substring-after(@name,'_'),'_')))[1])]">
                <Set name="{substring-before(substring-after(@name,'_'),'_')}">
                    <xsl:variable name="set" select="concat($super,'_',substring-before(substring-after(@name,'_'),'_'))"/>
                    <!-- the trailing '_' stops "Superset1_Set1" from also matching "Superset1_Set10_..." -->
                    <xsl:for-each select="//Set/Element[starts-with(@name,concat($set,'_'))]">
                        <Element name="{substring-after(substring-after(@name,'_'),'_')}"/>
                    </xsl:for-each>
                </Set>
            </xsl:for-each>
        </Superset>
    </xsl:for-each>
</xsl:template>

The trick is just Muenchian grouping and capturing the right key values.

It's really not pretty, so I'm sure there's a better solution available but I'm jetlagged :P
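
To make the Muenchian grouping step concrete, here is a minimal one-level sketch that groups by Superset only. The key name bySuper and the count attribute are made up for this illustration and are not part of the solution above.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"/>

  <!-- index every Element by the part of @name before the first underscore -->
  <xsl:key name="bySuper" match="Set/Element" use="substring-before(@name,'_')"/>

  <xsl:template match="/">
    <!-- keep an Element only if it is the first node the key returns for its own value -->
    <xsl:for-each select="Set/Element[generate-id() = generate-id(key('bySuper',substring-before(@name,'_'))[1])]">
      <!-- key() then hands back the whole group for that value -->
      <Superset name="{substring-before(@name,'_')}"
                count="{count(key('bySuper',substring-before(@name,'_')))}"/>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

Applied to the input from the question, this should emit one <Superset> per distinct prefix, with counts of 3 for Superset1 and 2 for Superset2. The two-level version repeats the same pattern with a second, composite key.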

annakata
  • +1 - Though I would recommend replacing the "//Set/Element" expressions with the equivalent calls of key() - that's what you have it for, after all. :) – Tomalak Jul 01 '09 at 09:46
  • Thanks guys for your in-depth solution proposition. Since my case is much more complicated than the example given I have not yet decided which of your versions I should use. Time will tell I suppose... – Yaneeve Jul 01 '09 at 10:46
  • I can't seem to validate this xslt fragment even when I wrap it up with the xsl:stylesheet element... – Yaneeve Jul 01 '09 at 12:11
  • WFM - what's the problem you're seeing? – annakata Jul 01 '09 at 12:18
  • Sorry :( My Bad... It works now, I must have copied it wrong off this page. – Yaneeve Jul 01 '09 at 13:28
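
Following the suggestion in the comments to replace the "//Set/Element" expressions with key() calls, the fragment from this answer could be reworked roughly as follows. This is a sketch, not the answer's original code: the stylesheet wrapper and the setName variable are additions for illustration.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"/>

  <xsl:key name="supers" match="Set/Element" use="substring-before(@name,'_')"/>
  <xsl:key name="sets" match="Set/Element"
           use="concat(substring-before(@name,'_'),'_',substring-before(substring-after(@name,'_'),'_'))"/>

  <xsl:template match="/">
    <!-- first Element of each Superset group -->
    <xsl:for-each select="Set/Element[generate-id() = generate-id(key('supers',substring-before(@name,'_'))[1])]">
      <xsl:variable name="super" select="substring-before(@name,'_')"/>
      <Superset name="{$super}">
        <!-- only this Superset's Elements are scanned, via key() rather than "//" -->
        <xsl:for-each select="key('supers',$super)[generate-id() = generate-id(key('sets',concat($super,'_',substring-before(substring-after(@name,'_'),'_')))[1])]">
          <xsl:variable name="setName" select="substring-before(substring-after(@name,'_'),'_')"/>
          <Set name="{$setName}">
            <!-- exact key match, so no prefix test on @name is needed -->
            <xsl:for-each select="key('sets',concat($super,'_',$setName))">
              <Element name="{substring-after(substring-after(@name,'_'),'_')}"/>
            </xsl:for-each>
          </Set>
        </xsl:for-each>
      </Superset>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

Looking groups up through the keys avoids rescanning the whole document at each nesting level, and the exact match on the composite key cannot confuse names that merely share a prefix.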