
I have some non-hierarchical XML that has a pseudo-structure. Every object declares a parent (except the "root" object) and zero or more children, but it does so through ids and reference attributes rather than through nesting. I would like to convert this to a nested structure.

<document>
    <object id="6" children="12,15"/>
    <object id="12" parent="6" children="13,18"/>
    <object id="13" parent="12" children="14,16,17"/>
    <object id="14" parent="13"/>
    <object id="15" parent="6" children="21,22"/>
    <object id="16" parent="13"/>
    <object id="17" parent="13"/>
    <object id="18" parent="12" children="23,25"/>
    <object id="19" parent="23"/>
    <object id="21" parent="15"/>
    <object id="22" parent="15"/>
    <object id="23" parent="18" children="19,24"/>
    <object id="24" parent="23"/>
    <object id="25" parent="18"/>
</document>

For the record, the actual document also contains object definitions, which each object references through an attribute, much like a class. I need to retrieve the element name from the definition, again by reference id. At some point in the process I rename each "object" to "template" or "subsection"; if it simplifies things, I can do this after applying the structure. I also have a tokenize "function" for the children attribute, since I am using XSLT 1.0, which has no built-in one.

So for the example above I would like this output:

<document>
    <object id="6">
        <object id="12">
            <object id="13">
                <object id="14"/>
                <object id="16"/>
                <object id="17"/>
            </object>
            <object id="18">
                <object id="23">
                    <object id="19"/>
                    <object id="24"/>
                </object>
                <object id="25"/>
            </object>
        </object>
        <object id="15">
            <object id="21"/>
            <object id="22"/>
        </object>
    </object>
</document>

Please keep in mind that these object elements contain other information, attributes, data, etc. These have been removed to simplify the example, but may add a layer of complexity to the problem.

If possible, I would like to do this in an elegant and extensible way. I am not required to, but I would prefer to use XSLT 1.0 so that it can be integrated with the existing server software.

Thank you kindly to anyone who can help me or point me in the right direction!

kcstrong
  • This question is very ambiguous: 1). No complete XML document is provided (a minimal but complete document is necessary); 2). There is both `` and `` -- it seems to me that only one name needs to be generated, or else there should be an explanation of when to generate `topic` and when to generate `item`. Please edit the question and provide the missing data/explanation. – Dimitre Novatchev Apr 03 '12 at 03:37
  • OK Dimitre, I have respectfully reposed my question. Now my question to you is can you help, or were you simply policing the board? – kcstrong Apr 03 '12 at 14:20
  • kcstrong: If I am not interested to help, why would I waste my time asking for clarification? I am starting to work in a few minutes -- will probably have time for your question in 10 hrs from now. I had a not bad solution before I came to the ambiguities which stopped me from publishing this solution. – Dimitre Novatchev Apr 03 '12 at 14:30
  • Thank you, your help is very much appreciated. I am finishing up one other task and then I will implement your solution. – kcstrong Apr 03 '12 at 19:16

2 Answers


Without writing the full XSLT, you could structure your transform as shown below. The template for book calls apply-templates for its chapters, the template for chapter calls apply-templates for its topics, and so on. The key is to put the parent's id into a variable so that the following apply-templates call can use it to find the children.

<xsl:template match="/">
   <document>
      <xsl:apply-templates select="/document/book" />
   </document>
</xsl:template>

<xsl:template match="/document/book">
   <!-- Save the parent's id so the child lookup below can refer to it -->
   <xsl:variable name="bookid" select="@id"/>
   <book id="{@id}">
      <!-- Process only the chapters that name this book as their parent -->
      <xsl:apply-templates select="/document/chapter[@parent=$bookid]" />
   </book>
</xsl:template>

<xsl:template match="/document/chapter">
   <!-- Replicate the book template above: store @id in a variable,
        emit the chapter element, then apply-templates to the next
        level down whose @parent equals that variable. -->
</xsl:template>
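
The same variable-based lookup can probably be written once instead of once per element name. The following is only a sketch under assumptions: it assumes every element to be nested is a direct child of `document`, that each carries an `id` attribute, and that top-level elements have no `parent` attribute. The variable name `parentid` is made up for illustration.

```xml
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output indent="yes"/>

 <!-- Start from the elements that declare no parent -->
 <xsl:template match="/document">
   <document>
     <xsl:apply-templates select="*[not(@parent)]"/>
   </document>
 </xsl:template>

 <!-- One template for every level, whatever the element is called -->
 <xsl:template match="/document/*">
   <!-- Hypothetical variable name; holds the id of the current element -->
   <xsl:variable name="parentid" select="@id"/>
   <!-- Re-create the element under its own name, keeping only the id -->
   <xsl:element name="{name()}">
     <xsl:attribute name="id">
       <xsl:value-of select="@id"/>
     </xsl:attribute>
     <!-- Recurse into the siblings that name this element as their parent -->
     <xsl:apply-templates select="/document/*[@parent=$parentid]"/>
   </xsl:element>
 </xsl:template>
</xsl:stylesheet>
```

Because the template matches by position rather than by name, renaming an element in the source should not require a change here. The predicate scans all children of `document` on every call, so it may be slow on large inputs.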
javram
  • Thank you for your response. It looks like this is the only way to go. I had tried and failed but was hoping there was still some combination of methods that would allow me to recursively nest elements to nth level without specifically declaring what they should be. This would prevent me from having to change the code if someone decided to change one of the elements names and allow me to reuse the code to process similar documents from the same system. Cheers! – kcstrong Apr 03 '12 at 12:59
  • It looks like you totally changed the question now, so my original answer doesn't make sense anymore. I would advise you though, that if you are worried about changes to the source document, you should consider publishing a schema. That way you can validate against the schema before running your transformation, and you can ensure that the results are uniform as long as the input document meets the predefined schema. Even a generic solution can become broken if the source document is altered enough. – javram Apr 03 '12 at 17:11

This short and simple, complete transformation:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>

 <xsl:key name="kChildren" match="object" use="@parent"/>

 <xsl:template match="/*">
   <document>
     <xsl:apply-templates select="*[not(@parent)]"/>
   </document>
 </xsl:template>

 <xsl:template match="object">
  <object id="{@id}">
    <xsl:apply-templates select="key('kChildren', @id)"/>
  </object>
 </xsl:template>
</xsl:stylesheet>

when applied on the provided XML document:

<document>
    <object id="6" children="12,15"/>
    <object id="12" parent="6" children="13,18"/>
    <object id="13" parent="12" children="14,16,17"/>
    <object id="14" parent="13"/>
    <object id="15" parent="6" children="21,22"/>
    <object id="16" parent="13"/>
    <object id="17" parent="13"/>
    <object id="18" parent="12" children="23,25"/>
    <object id="19" parent="23"/>
    <object id="21" parent="15"/>
    <object id="22" parent="15"/>
    <object id="23" parent="18" children="19,24"/>
    <object id="24" parent="23"/>
    <object id="25" parent="18"/>
</document>

produces the wanted, correct result:

<document>
    <object id="6">
        <object id="12">
            <object id="13">
                <object id="14"/>
                <object id="16"/>
                <object id="17"/>
            </object>
            <object id="18">
                <object id="23">
                    <object id="19"/>
                    <object id="24"/>
                </object>
                <object id="25"/>
            </object>
        </object>
        <object id="15">
            <object id="21"/>
            <object id="22"/>
        </object>
    </object>
</document>

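Explanation: Proper use of keys. The xsl:key named kChildren indexes every object by its parent attribute, so key('kChildren', @id) returns all objects whose parent is the current object. The transformation starts from the objects that have no parent attribute and recurses from there. The children attribute is never read, so no tokenizing is needed; the children of each object are output in document order.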
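
The question mentions that the real objects carry other attributes and data. A possible variation on the transformation above, offered as a sketch rather than a tested solution, copies everything except the two reference attributes. It assumes that any content inside an object should be copied unchanged and placed before the nested child objects.

```xml
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <!-- Same key as above: look up objects by the id of their parent -->
 <xsl:key name="kChildren" match="object" use="@parent"/>

 <xsl:template match="/*">
   <xsl:copy>
     <xsl:apply-templates select="object[not(@parent)]"/>
   </xsl:copy>
 </xsl:template>

 <xsl:template match="object">
   <xsl:copy>
     <!-- Keep every attribute except the two reference attributes -->
     <xsl:copy-of select="@*[not(name() = 'parent' or name() = 'children')]"/>
     <!-- Assumption: existing content is copied as-is, ahead of the children -->
     <xsl:copy-of select="node()"/>
     <!-- Nest the objects that name this one as their parent -->
     <xsl:apply-templates select="key('kChildren', @id)"/>
   </xsl:copy>
 </xsl:template>
</xsl:stylesheet>
```

If the objects must be renamed to "template" or "subsection", the xsl:copy in the second template could be replaced by xsl:element with a computed name, which depends on how the definitions are referenced in the real document.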

Dimitre Novatchev