
Suppose I have this input file, file1.xml:

<schema>
    <sequence> 
        <nodeA id="a">
            <fruit id="small">
                <orange id="x" method="create">                    
                    <attributes>
                        <color>Orange</color>
                        <year>2000</year>
                    </attributes>
                </orange>                           
            </fruit>
            <fruit id="small">
                <apple id="x" method="create">                    
                    <attributes>
                        <color>Orange</color>
                        <year>2000</year>
                    </attributes>
                </apple>                           
            </fruit>
            <fruit id="medium">
                <orange id="x" method="create">                    
                    <attributes>
                        <color>Orange</color>
                        <year>2000</year>
                    </attributes>
                </orange>                           
            </fruit>
        </nodeA>
        <nodeB id="b">
            <dog id="large">
                <doberman id="x" method="create">
                    <condition>
                        <color>Black</color>
                    </condition>
                </doberman>
            </dog>
        </nodeB>
    </sequence>
</schema>

file2.xml:

<schema>
    <sequence>
        <nodeA id="a">
            <fruit id="small">
                <melon id="x" method="create">
                    <attributes>
                        <color>Orange</color>
                        <year>2000</year>
                    </attributes>
                </melon>
            </fruit>
        </nodeA>
        <nodeB id="b">
            <dog id="small">
                <poodle id="x" method="create">                    
                    <condition>
                        <color>White</color>
                    </condition>
                </poodle>  
            </dog>                
        </nodeB>
    </sequence>
</schema>

After concatenation, the output file concate.xml should be:

<schema>
    <sequence>
        <nodeA id="a">
            <fruit id="small">
                <orange id="x" method="create">                    
                    <attributes>
                        <color>Orange</color>
                        <year>2000</year>
                    </attributes>
                </orange>                        
            </fruit>
            <fruit id="small">
                <apple id="x" method="create">                    
                    <attributes>
                        <color>Orange</color>
                        <year>2000</year>
                    </attributes>
                </apple>                           
            </fruit>
            <fruit id="medium">
                <orange id="x" method="create">                    
                    <attributes>
                        <color>Orange</color>
                        <year>2000</year>
                    </attributes>
                </orange>                           
            </fruit>
            <fruit id="small">
                <melon id="x" method="create">
                    <attributes>
                        <color>Orange</color>
                        <year>2000</year>
                    </attributes>
                </melon>
            </fruit>
        </nodeA>
        <nodeB id="b">
            <dog id="large">
                <doberman id="x" method="create">
                    <condition>
                        <color>Black</color>
                    </condition>
                </doberman>
            </dog>
            <dog id="small">
                <poodle id="x" method="create">                    
                    <condition>
                        <color>White</color>
                    </condition>
                </poodle>  
            </dog>                
        </nodeB>
    </sequence>
</schema>

The concatenation depends on the file order: the nodes from file2.xml are placed below the nodes from file1.xml (as shown in the example). I have up to 5 files. How can this be achieved using XSLT only, i.e. with a stylesheet that takes the 5 files as input at the same time and outputs 1 file?

This is the document structure, with the points where the merge happens:

<schema>
    <sequence> 
        <nodeA id="a">
            <fruit id="small">
                <orange id="x" method="create">                    
                    ...
                </orange>                
            </fruit>
            <fruit id="small">
                ...                          
            </fruit>
            <fruit id="large"> 
                ...                          
            </fruit>

            <!-- we merge below this -->
        </nodeA>

        <nodeB id="b">
            <dog id="large">
                <doberman id="x" method="create">
                    ...
                </doberman>
            </dog>
            <dog id="small">
                <doberman id="x" method="create">
                    ...
                </doberman>
            </dog>
        <!-- we merge below this -->
        </nodeB>

        <somenode id="any">
            ...    
        </somenode>
    </sequence>
</schema>

Note: If that is not possible, concatenating only two input files is fine, since the transformation can be repeated for the other files. Also, the files contain various node names (nodeA, nodeB, SomeNode, etc.), so a solution that generalizes to any node name is needed.

We can use XSLT 1.0 or 2.0.

Thanks very much. John

John
  • Can we see the XSLT you already have? You're showing the input and the output, but not the code that is exhibiting the problem. – Merlyn Morgan-Graham May 08 '12 at 02:29
  • @MerlynMorgan-Graham I'm using the merge solution from [here](http://www2.informatik.hu-berlin.de/~obecker/XSLT/merge/merge.xslt.html), but that solution does not account for node order. Thanks. – John May 08 '12 at 04:51

3 Answers


@John, here's a more generic solution:

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:a="http://a.com">
   <xsl:strip-space elements="*" />
   <xsl:output indent="yes" method="xml" />

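   <!-- A sequence (comma) keeps the files in the listed order; a union (|) would
        sort the documents in an implementation-dependent order -->
   <xsl:variable name="to-merge" select="(document('input2.xml'), document('input3.xml'))"/>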

   <xsl:function name="a:id">
      <xsl:param name="ctx"/>
      <xsl:value-of select="concat($ctx/local-name(), $ctx/@id)"/>
   </xsl:function>   

   <xsl:key name="match" match="/schema/sequence/*" use="a:id(.)"/>

   <xsl:template match="@* | node()">
      <xsl:copy>
         <xsl:apply-templates select="@* | node()"/>
      </xsl:copy>
   </xsl:template>

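   <!-- Matches exactly the elements indexed by the "match" key, i.e. the merge points -->
   <xsl:template match="*[count(. | key('match', a:id(.))) = count(key('match', a:id(.)))]">
      <xsl:copy>
         <xsl:apply-templates select="@* | node()"/>

         <!-- Append the children of the matching merge point from each extra document -->
         <xsl:variable name="id" select="a:id(.)"/>
         <xsl:for-each select="$to-merge">
            <xsl:apply-templates select="key('match', $id)/*"/>
         </xsl:for-each>
      </xsl:copy>
   </xsl:template>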

</xsl:stylesheet>

The key defines the merge point, and the a:id function defines how nodes are matched. To fall back to XSLT 1.0, inline the a:id expression, concat(local-name(), @id), into the key's use attribute and into the predicates.

My assumptions:

  • you run the transformation on the "leading" document and list the documents to merge, in order, in the to-merge variable
  • you have a single merge point that is located at the same spot in each document to be merged. It shouldn't be hard to customize the solution to merge from different points in each document.
  • the nodes match by local-name() and @id
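
The XSLT 1.0 fallback could look roughly like the sketch below. It is untested against any particular processor and rests on a few assumptions: it reuses the file names input2.xml and input3.xml from the stylesheet above, it inlines concat(local-name(), @id) where a:id was used, and it matches the merge points with the pattern /schema/sequence/* directly rather than with the count() test (a simplification on my part). Each extra document gets its own xsl:for-each, because a node-set union in XSLT 1.0 is processed in document order and the relative order of nodes from different documents is implementation-dependent.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
   <xsl:strip-space elements="*"/>
   <xsl:output indent="yes" method="xml"/>

   <!-- Same key as above, with the a:id logic inlined for XSLT 1.0 -->
   <xsl:key name="match" match="/schema/sequence/*" use="concat(local-name(), @id)"/>

   <!-- Identity template: copies everything that is not a merge point -->
   <xsl:template match="@* | node()">
      <xsl:copy>
         <xsl:apply-templates select="@* | node()"/>
      </xsl:copy>
   </xsl:template>

   <!-- Merge point: copy the element, then append children from the other files -->
   <xsl:template match="/schema/sequence/*">
      <xsl:variable name="id" select="concat(local-name(), @id)"/>
      <xsl:copy>
         <xsl:apply-templates select="@* | node()"/>
         <!-- xsl:for-each switches the context document so that key() searches it;
              one block per file keeps the file order explicit (assumed file names) -->
         <xsl:for-each select="document('input2.xml')">
            <xsl:apply-templates select="key('match', $id)/*"/>
         </xsl:for-each>
         <xsl:for-each select="document('input3.xml')">
            <xsl:apply-templates select="key('match', $id)/*"/>
         </xsl:for-each>
      </xsl:copy>
   </xsl:template>
</xsl:stylesheet>
```

As with the 2.0 version, elements under schema/sequence that exist only in the extra files are not added by this sketch; only merge points already present in the leading document are extended.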
Pavel Veller

Here is another answer. It joins at the schema/sequence/* level, rather than just at nodeA and nodeB.

 <?xml version="1.0"?>
 <xsl:stylesheet 
   xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
   xmlns:fn="http://www.w3.org/2005/xpath-functions"
   xmlns:xs="http://www.w3.org/2001/XMLSchema"
   version="2.0"
   exclude-result-prefixes="xsl xs fn">

 <xsl:output indent="yes" encoding="UTF-8" />
 <xsl:param name="file2" /> <!-- URI of file2.xml; file1.xml is the main input document -->
 <xsl:variable name="file1-doc" select="root()" />
 <xsl:variable name="file2-doc" select="document($file2)" />


 <xsl:template  match="/">
  <schema>
   <sequence>
    <xsl:call-template name="union-point">
     <xsl:with-param name="value" select="schema/sequence/*"/>
    </xsl:call-template>
    <xsl:call-template name="union-point">
     <!-- The following predicate excludes all the node names that we
             have already processed in the first call-template.    -->
     <xsl:with-param name="value" select="$file2-doc/schema/sequence/*
      [not(name() = $file1-doc/schema/sequence/*/name())]
      "/>
    </xsl:call-template>
   </sequence>
  </schema>
 </xsl:template>

 <xsl:template name="union-point">
   <xsl:param name="value"/>
   <xsl:for-each select="$value/name()" >
    <xsl:variable name="node-name" select="."/>
    <xsl:element name="{.}">
     <!-- The comma keeps file1's id first; a union (|) across two documents has no guaranteed order -->
     <xsl:attribute name="id">
      <xsl:value-of select="($file1-doc/schema/sequence/*[name()=$node-name]/@id,
                             $file2-doc/schema/sequence/*[name()=$node-name]/@id  )[1]" />
     </xsl:attribute>
     <xsl:apply-templates select="$file1-doc/schema/sequence/*[name()=$node-name]/*" />
     <xsl:apply-templates select="$file2-doc/schema/sequence/*[name()=$node-name]/*" />
    </xsl:element>
   </xsl:for-each>
 </xsl:template>

 <xsl:template match="element()">
   <xsl:copy>
     <xsl:apply-templates select="@*,node()"/>
    </xsl:copy>
 </xsl:template>

 <xsl:template match="attribute()|text()|comment()|processing-instruction()">
   <xsl:copy/>
 </xsl:template>

 </xsl:stylesheet>

As a solution, it's probably a bit clumsy and awkward, but it basically works. Hopefully an expert like Dimitre Novatchev will come along and offer a tidier alternative. This is about the limit of my ability.

UPDATE 1: I added the id attribute to the generated elements (nodeA, etc.).

UPDATE 2: Here is the resulting output:

 <?xml version="1.0" encoding="UTF-8"?>
 <schema>
    <sequence>
       <nodeA id="a">
          <fruit id="small">
                 <orange id="x" method="create">                    
                     <attributes>
                         <color>Orange</color>
                         <year>2000</year>
                     </attributes>
                 </orange>                           
             </fruit>
               <fruit id="small">
                 <apple id="x" method="create">                    
                     <attributes>
                         <color>Orange</color>
                         <year>2000</year>
                     </attributes>
                 </apple>                           
             </fruit>
          <fruit id="medium">
                 <orange id="x" method="create">                    
                     <attributes>
                         <color>Orange</color>
                         <year>2000</year>
                     </attributes>
                 </orange>                           
             </fruit>
          <fruit id="small">
                 <melon id="x" method="create">
                     <attributes>
                         <color>Orange</color>
                         <year>2000</year>
                     </attributes>
                 </melon>
             </fruit>
       </nodeA>
       <nodeB id="b">
          <dog id="large">
                 <doberman id="x" method="create">
                     <condition>
                         <color>Black</color>
                     </condition>
                 </doberman>
             </dog>
          <dog id="small">
                 <poodle id="x" method="create">                    
                     <condition>
                         <color>White</color>
                     </condition>
                 </poodle>  
             </dog>
       </nodeB>
    </sequence>
 </schema>
Sean B. Durkin
  • hi, after looking at the result from your solution I think you forget attribute details on the . Your result is `` instead of `` Is it possible to include all the attribute on that node? Thanks. – John May 09 '12 at 01:58
  • Yep. You're right. I will adjust the style-sheet to add the attributes. I suppose you want the attributes from both sources merged? E.g. what if File1's id for nodeB is different from that of File2? – Sean B. Durkin May 09 '12 at 03:13
  • because we merged based on the nodeB/nodeX-id, if we cannot find the same id for nodeB/nodeX on file1, we simply add the whole chunk of that nodeB/nodeX to file 1 at the very bottom. – John May 09 '12 at 06:58
  • I added the id attribute to . It gets its value from file1, unless the node or attribute never was in file1, in which case it gets its value from file2. This style-sheet does not copy all the attributes of nodeA, just the "id" one. I hope that this is ok. – Sean B. Durkin May 09 '12 at 08:46
  • Hi, It seems that you are forgetting the importance of id in the or . Because the merging is based on the node name and also the id not just the node name. Thanks. – John May 09 '12 at 15:29
  • I don't understand. The resultant already has the id attribute. What am I missing? My resultant output appears the same as your sample output. Any way, try Pavel's solution. If his one works, go with that. – Sean B. Durkin May 09 '12 at 23:40
  • What I mean is that your solution does not account for the node id when merging. For example, if I change nodeB to nodeA it does not work, as it only matches on the node name (i.e. nodeB or nodeA). So I will use Pavel's solution. Anyway, I really appreciate what you've done to answer this. Thanks very much. – John May 10 '12 at 01:06

Try:

 <?xml version="1.0"?>
 <xsl:stylesheet 
   xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
   xmlns:fn="http://www.w3.org/2005/xpath-functions"
   xmlns:xs="http://www.w3.org/2001/XMLSchema"
   version="2.0"
   exclude-result-prefixes="xsl xs fn">

 <xsl:output indent="yes" encoding="UTF-8" />
 <xsl:param name="file2" /> <!-- URI of file2.xml; file1.xml is the main input document -->
 <xsl:variable name="file2-doc" select="document($file2)" />

 <xsl:template  match="/">
  <schema>
   <sequence>
    <nodeA id="a">
     <xsl:apply-templates select="schema/sequence/nodeA/*" />
     <xsl:apply-templates select="$file2-doc/schema/sequence/nodeA/*" />
    </nodeA>
    <nodeB id="b">
     <xsl:apply-templates select="schema/sequence/nodeB/*" />
     <xsl:apply-templates select="$file2-doc/schema/sequence/nodeB/*" />
    </nodeB>
   </sequence>
  </schema>
 </xsl:template>

 <xsl:template match="element()">
   <xsl:copy>
     <xsl:apply-templates select="@*,node()"/>
    </xsl:copy>
 </xsl:template>

 <xsl:template match="attribute()|text()|comment()|processing-instruction()">
   <xsl:copy/>
 </xsl:template>

 </xsl:stylesheet>

Make file1 your main input document. Pass the filename of file2 as the parameter "file2". Extend this in the same way for multiple input files.
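
One way to extend this to several input files without adding a parameter per file is sketched below. It is only an illustration under stated assumptions: the parameter name files, its comma-separated format and the default file names are hypothetical, and the sketch matches merge points generically by element name and @id instead of hard-coding nodeA and nodeB. It needs an XSLT 2.0 processor for tokenize() and the for expression.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output indent="yes" encoding="UTF-8"/>

  <!-- Hypothetical parameter: comma-separated URIs of the extra files, in merge order -->
  <xsl:param name="files" select="'file2.xml,file3.xml'"/>

  <!-- A sequence of document nodes; the order of the list is preserved -->
  <xsl:variable name="docs"
                select="for $f in tokenize($files, ',') return document($f)"/>

  <!-- Identity template for everything that is not a merge point -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Merge point: every child of schema/sequence in the main document (file1) -->
  <xsl:template match="/schema/sequence/*">
    <xsl:variable name="this" select="."/>
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
      <!-- Append the children of same-named, same-id elements from each extra file -->
      <xsl:for-each select="$docs">
        <xsl:apply-templates
            select="schema/sequence/*[name() = name($this) and @id = $this/@id]/*"/>
      </xsl:for-each>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

Run it with file1 as the main input and pass the remaining files through the files parameter. Elements without an id attribute never match in this sketch, and elements that appear only in the extra files are not added.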

Sean B. Durkin
  • Thanks. Could you provide a solution that does not depend on nodeA and nodeB? The actual file consists of many nodes with different names. – John May 08 '12 at 04:05
  • In that case, I don't understand your question. What are the rules for the desired transformation? Perhaps you could give a few more (but short) use cases? – Sean B. Durkin May 08 '12 at 04:55
  • Your solution is all right, but it depends on nodeA and nodeB only. What if I have many more arbitrary node names in the file? That is why I need something that can be applied to different node names. Cheers, – John May 08 '12 at 05:05
  • At what level in the document do you want the union to occur at? Is it at the schema/sequence/* level? For example, you only want one node, right? – Sean B. Durkin May 08 '12 at 06:36
  • yes it is at schema/sequence/*. Only one but multiple . I have added the doc structure above. To make it clearer. Thanks so much for your time. – John May 08 '12 at 08:17
  • Please tick my answer, or at least up-arrow it. – Sean B. Durkin May 08 '12 at 08:29