
I want to transform

    <entry>
        <parent1>
            <object_id>1580</object_id>
        </parent1>
        <parent1>
            <object_id>1586</object_id>
        </parent1>
        <parent2>
            <object_id>1582</object_id>
        </parent2>
        <parent2>
            <object_id>1592</object_id>
        </parent2>
    </entry>

into

    <entry>
        <parent1>1580-1586</parent1>
        <parent2>1582-1592</parent2>
    </entry>

The name of the top-level element is unknown. The parent element names are also unknown, and the number of parent elements sharing a name can vary. The child element name is known: it is always "object_id".

So I would like to group the parents generically, by element name, and concatenate the child values of each group, delimited by "-".

"Merge XML nodes using XSLT" comes close to answering the question, as does "Group/merge childs of same nodes in xml/xslt", but neither is quite what I need.

So far I have:

    <xsl:key name="groupName" match="*[object_id]" use="."/>
    <xsl:template match="*[generate-id(.) = generate-id(key('groupName', .))]">
        <xsl:copy>
        <xsl:call-template name="join"> 
                <xsl:with-param name="list" select="object_id" /> 
                <xsl:with-param name="separator" select="'-'" />                                             
        </xsl:call-template>
        </xsl:copy> 
    </xsl:template>

    <xsl:template name="join"> 
    <xsl:param name="list" /> 
    <xsl:param name="separator"/>     
    <xsl:for-each select="$list"> 
      <xsl:value-of select="." /> 
      <xsl:if test="position() != last()"> 
        <xsl:value-of select="$separator" />         
      </xsl:if> 
    </xsl:for-each> 
    </xsl:template>

Thanks in advance!

gbentley
  • The Use attribute of your key needs to be the parent name, not the object_id text. This is what you are trying to group on: parent name. – Sean B. Durkin Aug 27 '12 at 00:22
  • Doesn't the match/use combo achieve that? Match = all nodes with a child node of 'object_id'; Use = 'the node itself'. Or do I need to use 'name()'? – gbentley Aug 27 '12 at 00:48
  • No. Use="." results in the value of the key being the string value of the matched node, **not** the name of the node. Yes, you need to use either name() or local-name(), depending on your data. – Sean B. Durkin Aug 27 '12 at 01:11
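The fix described in these comments can be sketched as a complete stylesheet. This is only a minimal sketch, not the asker's final code: it keeps the key name `groupName` from the question, switches the key value to `name()`, and replaces the named `join` template with an inline loop. It assumes a single top-level element, because the key is document-wide and would merge same-named parents that sit in different containers.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Key on the element NAME; use="." would key on the string value instead -->
  <xsl:key name="groupName" match="*[object_id]" use="name()"/>

  <!-- Identity rule: copies the (unknown) top-level element as-is -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Every element with an object_id child is dropped by default ... -->
  <xsl:template match="*[object_id]" priority="1"/>

  <!-- ... except the first element of each name group, which is emitted
       once with all object_id values of its group joined by '-' -->
  <xsl:template priority="2"
      match="*[object_id][generate-id() = generate-id(key('groupName', name())[1])]">
    <xsl:copy>
      <xsl:for-each select="key('groupName', name())/object_id">
        <xsl:if test="position() != 1">-</xsl:if>
        <xsl:value-of select="."/>
      </xsl:for-each>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

The explicit `priority` values are there because both `*[object_id]` patterns would otherwise have the same default priority and conflict.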

2 Answers

Here is a slightly different solution, developed before I noticed Dimitre's post.

    <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" indent="yes"/>
    <xsl:strip-space elements="*" />

    <xsl:key name="kParents" match="*[object_id]" use="local-name()" />

    <xsl:template match="@*|node()">
     <xsl:copy>
       <xsl:apply-templates select="@*|node()" />
     </xsl:copy>
    </xsl:template>

    <xsl:template match="*[*/object_id]">
     <xsl:variable name="grandparent-id" select="generate-id()" />
     <xsl:copy>
       <xsl:apply-templates select="@* | node()[not(object_id)] |
        *[generate-id()=
          generate-id(
            key('kParents',local-name())[generate-id(..)=$grandparent-id][1])]"
          mode="group-head" />
     </xsl:copy>
    </xsl:template>

    <xsl:template match="*[object_id]" mode="group-head">
     <xsl:variable name="grandparent-id" select="generate-id(..)" />
     <xsl:copy>
       <xsl:apply-templates select="@* | node()[not(self::object_id)]" />
       <xsl:for-each select="key('kParents',local-name())[generate-id(..)=$grandparent-id]/object_id">
         <xsl:value-of select="." />
         <xsl:if test="position() != last()"> - </xsl:if>
       </xsl:for-each>
     </xsl:copy>
    </xsl:template>

    </xsl:stylesheet>

Update

I updated the stylesheet to reflect the OP's comment that '-' is a delimiter between all values, rather than a separator between only the first and last values.

Sean B. Durkin

This transformation:

    <xsl:stylesheet version="1.0"
     xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
     <xsl:output omit-xml-declaration="yes" indent="yes"/>
     <xsl:strip-space elements="*"/>

     <xsl:key name="kObjByValAndParent" match="object_id"
      use="name(..)"/>

     <xsl:template match="node()|@*">
      <xsl:copy>
       <xsl:apply-templates select="node()|@*"/>
      </xsl:copy>
     </xsl:template>

     <xsl:template match="/*/*"/>

     <xsl:template priority="2" match=
     "/*/*[generate-id(object_id)
          =
           generate-id(key('kObjByValAndParent',name())[1])
          ]
     ">
       <xsl:copy>
         <xsl:value-of select=
         "concat(object_id, ' - ',
                 key('kObjByValAndParent',name())[last()]
                )
         "/>
       </xsl:copy>
     </xsl:template>
    </xsl:stylesheet>

when applied on the provided XML document:

    <entry>
        <parent1>
            <object_id>1580</object_id>
        </parent1>
        <parent1>
            <object_id>1586</object_id>
        </parent1>
        <parent2>
            <object_id>1582</object_id>
        </parent2>
        <parent2>
            <object_id>1592</object_id>
        </parent2>
    </entry>

produces the wanted, correct result:

    <entry>
       <parent1>1580 - 1586</parent1>
       <parent2>1582 - 1592</parent2>
    </entry>

Explanation:

  1. Proper use and overriding of the identity rule.

  2. Proper use of the Muenchian grouping method.
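To make these two points concrete, here is an annotated sketch of the same pattern. It is a variant rather than the stylesheet above: the key name `kByContainerAndName` and the composite key value are my own assumptions, added so that groups are scoped to each containing element instead of to the whole document. It joins the values with a bare '-'.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Muenchian key (hypothetical name): container identity + element name,
       so same-named parents in different containers form separate groups -->
  <xsl:key name="kByContainerAndName" match="*[object_id]"
           use="concat(generate-id(..), '|', name())"/>

  <!-- 1. Identity rule: copy everything that is not overridden below -->
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>

  <!-- 2a. Override: suppress every element that has an object_id child -->
  <xsl:template match="*[object_id]" priority="1"/>

  <!-- 2b. Higher-priority override: the first member of each group
       (the generate-id() comparison is the Muenchian test) emits the group -->
  <xsl:template priority="2"
      match="*[object_id][generate-id() = generate-id(
               key('kByContainerAndName',
                   concat(generate-id(..), '|', name()))[1])]">
    <xsl:copy>
      <xsl:for-each select="key('kByContainerAndName',
                                concat(generate-id(..), '|', name()))/object_id">
        <xsl:if test="position() != 1">-</xsl:if>
        <xsl:value-of select="."/>
      </xsl:for-each>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

The `'|'` separator cannot occur in an element name, and `generate-id()` returns only ASCII alphanumeric characters, so the two parts of the composite key value cannot run together ambiguously.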


II. In case all values must be concatenated together, use this slightly modified solution:

    <xsl:stylesheet version="1.0"
     xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
     <xsl:output omit-xml-declaration="yes" indent="yes"/>
     <xsl:strip-space elements="*"/>

     <xsl:key name="kObjByValAndParent" match="object_id"
      use="name(..)"/>

     <xsl:template match="node()|@*">
      <xsl:copy>
       <xsl:apply-templates select="node()|@*"/>
      </xsl:copy>
     </xsl:template>

     <xsl:template match="/*/*"/>

     <xsl:template priority="2" match=
     "/*/*[generate-id(object_id)
          =
           generate-id(key('kObjByValAndParent',name())[1])
          ]
     ">
       <xsl:copy>
         <xsl:for-each select="key('kObjByValAndParent',name())">
          <xsl:if test="not(position()=1)"> - </xsl:if>
          <xsl:value-of select="."/>
         </xsl:for-each>
       </xsl:copy>
     </xsl:template>
    </xsl:stylesheet>

Dimitre Novatchev
  • The concat() looks like it only takes 2 parents with the same name - what about a variable number of parents? – gbentley Aug 27 '12 at 01:15
  • @gbentley, You are confused -- a node has at most one parent. – Dimitre Novatchev Aug 27 '12 at 01:34
  • Sorry, I'm not explaining myself correctly. In my question I mentioned that "Parent names are unknown, and the NUMBER of parent nodes with the same name can vary" - so concatenating a variable number of object_id values, rather than just 2. – gbentley Aug 27 '12 at 01:38
  • @gbentley, I interpret x-y as an interval from x to y. If this is a delimiter and you want all values, delimited by it, just let me know and I'll immediately modify the solution -- this is trivial. – Dimitre Novatchev Aug 27 '12 at 01:42
  • Yes, sorry, I want the values delimited by '-'. Thanks! I'll reword the original question. – gbentley Aug 27 '12 at 01:44
  • @gbentley, You are welcome. Do note that I updated my answer (at the end) with the exact transformation you were after. – Dimitre Novatchev Aug 27 '12 at 02:15