As suggested, I am breaking the question into parts.

My input XML indicates which fields are present in a string. It can have a maximum of 64 field elements, and the field elements always occur in ascending order.

My input XML:

<Root>
  <element>field2</element>
  <element>field3</element>
  <element>field21</element>
</Root>

The string is defined as a variable in the xslt.

My variable:

<xsl:variable name="inputstring" select="'013112316145ABC0812345678'"/>

The input XML says that fields 2, 3 and 21 are the only fields in the string. They are to be extracted according to a mapping XML.

Here is the mapping XML:

<Root>
  <field no="2" charlength="2">variable</field>
  <field no="3" total="4">fixed</field>
  <field no="21" charlength="2">
    <subfield no="1" total="3">fixed</subfield>
    <subfield no="2" charlength="2" idcode="ABC">variable</subfield>
  </field>
  <field no="63" charlength="2">
    <format1>
      <subfield no="1" total="3">fixed</subfield>
    </format1>
    <format2>
      <subfield no="1" total="3">fixed</subfield>
      <subfield no="2" total="7">fixed</subfield>
    </format2>
    <format3>
      <subfield no="1" total="3">fixed</subfield>
      <subfield no="2" total="7">fixed</subfield>
      <subfield no="3" total="6">fixed</subfield>
    </format3>
  </field>
</Root>

The mapping XML says the following:

  1. There are four types of fields: fixed, variable, fields with subfields (fixed and variable), and fields with subfields in different formats.
  2. Field 2 is a variable field. Its first two characters (the charlength attribute gives the width of this header) indicate the length of the field.
  3. Field 3 is a fixed field with a total of 4 characters.
  4. Field 21 is a field with subfields (fixed and variable). Its first two characters (charlength) indicate the number of characters in the field.
    • All fixed subfields occur first, followed by the variable subfields.
    • A variable subfield always starts with its idcode (for field 21's subfield 2, it is ABC), followed by the length of its content (charlength characters wide), then the content itself. The length can be 0 as well.
    • All fixed and variable subfields always occur; a length of 0 indicates the absence of a subfield (see the point above).
  5. Field 63 is a field with subfields in different formats. Which format applies depends on the length of the field (read from the header whose width is the charlength attribute).
    • For field 63, if the length (the first two characters) is 03, it is format 1; if it is 10, format 2; if it is 16, format 3.

My desired output XML:

<Root>
  <field2>3</field2>
  <!--value is 3: the 2-character length header (charlength is 2) reads 01, so 1 character follows-->
  <field3>1123</field3>
  <!--field3 value is 1123 as it is fixed, total length of 4-->
  <field21>
    <subfield1>145</subfield1>
    <!--subfield1 should be 145 as it is fixed length of total 3 chars-->
    <subfield2>12345678</subfield2>
    <!--subfield2 starts with 'ABC', then a length header of 08, so 8 chars follow-->
  </field21>
</Root>

Edit by Sean.

Break-down

Here is a break-down of the mapping between input and output.

This is a picture of our string variable $inputstring

'013112316145ABC0812345678'

This is broken up into 3 fields according to the field definitions...

013    -      1123  -  16145ABC0812345678
 |              |              v  
 v              v           field 21
field2        field3  

Let's break down field 2:

 01    3
  |    v
  |   payload for field 2. This is output
  v
Contains the length of the payload, which in this case is '01' = 1.
The length of this 'header' is given by the mapping node Root/field[@no="2"]/@charlength .
The "2" in this expression comes from the input document node at Root/element .
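The field 2 arithmetic above can be sketched as a tiny stand-alone XSLT 1.0 style-sheet. This is only an illustration: the header width (2) and the fixed length of field 3 (4) are hard-coded assumptions taken from the mapping shown earlier rather than looked up, and the variable names len2 and start3 are made up for the sketch.

```xml
<xsl:stylesheet version="1.0"
   xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" indent="yes" />

    <xsl:variable name="inputstring" select="'013112316145ABC0812345678'" />

    <xsl:template match="/">
        <!-- Field 2 (variable): the 2-character header holds the payload length. -->
        <xsl:variable name="len2" select="number(substring($inputstring, 1, 2))" />
        <!-- Field 3 (fixed, 4 characters) starts right after field 2's header and payload. -->
        <xsl:variable name="start3" select="1 + 2 + $len2" />
        <Root>
            <field2><xsl:value-of select="substring($inputstring, 3, $len2)" /></field2>
            <field3><xsl:value-of select="substring($inputstring, $start3, 4)" /></field3>
        </Root>
    </xsl:template>
</xsl:stylesheet>
```

Run against any well-formed input document, this should give field2 = 3 and field3 = 1123.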

Let's break down field 21:

16   145   ABC0812345678
 |    |       v
 |    |     subfield 2
 |    \ subfield 1
  \
   v
   Header for field 21. Says that the rest of field 21 (subfield 1 +
subfield 2) consists of 16 characters. The length of this header was derived from
the mapping node at Root/field[@no="21"]/@charlength .

And for the final example: a break-down of field 21/ subfield 2. This is a picture of subfield 2

ABC   08   12345678
 |     |     |
 |     |     v
 |     |    This is the payload. It is output as the text node child of output
 |     |      subfield 2
 |     v
 v    Length of the following payload
 Signature. Its length and value are given by the mapping node
   Root/field[@no="21"]/subfield[@no="2"]/@idcode
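The field 21 pictures above can be followed step by step in a small stand-alone XSLT 1.0 sketch. It is only an illustration under stated assumptions: the start position of field 21 (character 8), the header widths, the fixed length of subfield 1 and the 'ABC' signature are all hard-coded from the example instead of being read from the mapping, and the variable names are invented for the sketch.

```xml
<xsl:stylesheet version="1.0"
   xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" indent="yes" />

    <xsl:variable name="inputstring" select="'013112316145ABC0812345678'" />

    <xsl:template match="/">
        <!-- Field 21 starts at character 8, after field 2 (3 chars) and field 3 (4 chars). -->
        <xsl:variable name="field21" select="substring($inputstring, 8)" />
        <!-- 2-character header: number of characters that follow it within the field. -->
        <xsl:variable name="header" select="number(substring($field21, 1, 2))" />
        <xsl:variable name="body" select="substring($field21, 3, $header)" />
        <!-- Subfield 1 is fixed (3 characters); subfield 2 is whatever remains. -->
        <xsl:variable name="sub2" select="substring($body, 4)" />
        <field21>
            <subfield1><xsl:value-of select="substring($body, 1, 3)" /></subfield1>
            <!-- Subfield 2: 3-character signature, 2-character length, then the payload. -->
            <xsl:if test="starts-with($sub2, 'ABC')">
                <subfield2>
                    <xsl:value-of select="substring($sub2, 6, number(substring($sub2, 4, 2)))" />
                </subfield2>
            </xsl:if>
        </field21>
    </xsl:template>
</xsl:stylesheet>
```

For the example string this should give subfield1 = 145 and subfield2 = 12345678.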
Suresh
  • Suresh, I see a contradiction: Why is the `charlength` of `subfield2` of `field21` specified as `2`, while actually it is 8 (+3) ??? – Dimitre Novatchev Aug 20 '12 at 12:15
  • It's an interesting problem. What determines the length of variable fields & subfields? – Sean B. Durkin Aug 20 '12 at 12:22
  • Oh - I get it! The header characters give the length. I might edit the question to make it clearer. – Sean B. Durkin Aug 20 '12 at 12:27
  • That's a lot of work you are asking for Suresh. I hope you have made an effort to solve this yourself. Given your long line of XSLT questions, it might be time for you to pick up an XSLT book and do some reading. – Sean B. Durkin Aug 20 '12 at 12:53
  • @SeanB.Durkin, This edit is very nice. However it doesn't explain the contradiction that I am asking for in my first comment. Could you, please, explain? – Dimitre Novatchev Aug 20 '12 at 12:56
  • The charlength of field21 (value 2) specifies the length of the field's header, not the whole field. It is the value of this header (value 16) which gives you the length of the whole field (16 characters). – Sean B. Durkin Aug 20 '12 at 13:01
  • Subfield 2 is 13 characters long. It has a 3 character signature; a 2 character header and an 8 character payload. The content of the header is the length of the payload. – Sean B. Durkin Aug 20 '12 at 13:04
  • @DimitreNovatchev charlength specifies the number of characters in the length header. For subfield2 it is 2, so the header can be 00 to 99. In this case it is 08, and the following 8 characters are '12345678'. – Suresh Aug 20 '12 at 13:05
  • @SeanB.Durkin. Yeah. I have been working on XSLT for a long time. This site has always been useful to me for guidance and answers (especially Dimitre). Thanks a lot for the edit you have done. :) – Suresh Aug 20 '12 at 13:09
  • @Suresh, then the name of the attribute is misleading -- a better name would be: `charlength-length` . And a better name for `total` is `fixed-length` – Dimitre Novatchev Aug 20 '12 at 13:12
  • I'm not going to give you a stylesheet, but I will give you an approach. Use recursion with the unprocessed portion of the input string as a parameter. Use a test or tests to branch into the 3 types of fields: fixed, variable and compound. When the focus node is compound apply the algorithm recursively to the subfields. – Sean B. Durkin Aug 20 '12 at 13:17
  • Thanks Sean :) I knew that we have to use recursion for this complex template. However, I was stuck with the multiple conditions and, of course, was not sure where to start. The idea I had was to apply templates on the input XML (which tells us the fields); each time there is a match, we know that a field is present. I am currently stuck on manipulating the input string each time a template is matched, because we have to remove the portion of the string that the earlier template matched and then process the rest in the next template. Maybe I have to think about using the string in a recursive manner. – Suresh Aug 20 '12 at 13:50

1 Answer

Well..... I said I wouldn't do it, but I did it anyway.

Caveats

  1. Rule 5 (compound field with several formats) is not implemented. It was just too much work, so I leave the completion of Rule 5 to you.
  2. I tested on www.xmlper.com, so I used msxsl:node-set(). If you are not using a Microsoft XSLT engine, you may need to swap it for your engine's equivalent extension function.
  3. With this much string processing, you should seriously consider upgrading to XSLT 2.0.
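On caveat 2: several non-Microsoft XSLT 1.0 processors (libxslt/xsltproc and Saxon 6.5, for example) offer the EXSLT function exsl:node-set() in the http://exslt.org/common namespace for the same purpose. The following is a minimal stand-alone sketch of that call, assuming your processor supports EXSLT (check its documentation). The cut-down map variable and the types/type output elements are invented for the illustration.

```xml
<xsl:stylesheet version="1.0"
   exclude-result-prefixes="exsl"
   xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
   xmlns:exsl="http://exslt.org/common">
    <xsl:output method="xml" indent="yes" />

    <!-- A variable with content is a result tree fragment, not a node-set. -->
    <xsl:variable name="map">
        <field no="2" charlength="2">variable</field>
        <field no="3" total="4">fixed</field>
    </xsl:variable>

    <xsl:template match="/">
        <types>
            <!-- exsl:node-set() converts the fragment so that it can be navigated. -->
            <xsl:for-each select="exsl:node-set($map)/field">
                <type no="{@no}"><xsl:value-of select="." /></type>
            </xsl:for-each>
        </types>
    </xsl:template>
</xsl:stylesheet>
```

In the style-sheet below, that would mean declaring the exsl prefix and replacing each msxsl:node-set(...) call with exsl:node-set(...).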

Style-sheet

This XSLT 1.0 style-sheet...

<xsl:stylesheet version="1.0"
   exclude-result-prefixes="xsl so msxsl"
   xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
   xmlns:so="http://stackoverflow.com/questions/12035679"
   xmlns:msxsl="urn:schemas-microsoft-com:xslt">
    <xsl:output method="xml" indent="yes" />
    <xsl:strip-space elements="*" />

    <xsl:variable name="inputstring" select="'013112316145ABC0812345678'" />

    <xsl:variable name="map">
        <so:mapping>
            <field no="2" charlength="2">variable</field>
            <field no="3" total="4">fixed</field>
            <field no="21" charlength="2">
                <subfield no="1" total="3">fixed</subfield>
                <subfield no="2" charlength="2" idcode="ABC">variable</subfield>
            </field>
        </so:mapping>
    </xsl:variable>

    <xsl:template match="/*">
        <xsl:copy>
            <xsl:call-template name="process-fields">
                <xsl:with-param name="element-stack" select="element" />
                <xsl:with-param name="code" select="$inputstring" />
            </xsl:call-template>
        </xsl:copy>
    </xsl:template>

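    <!-- Recursively consume $code: parse the first field named in
         $element-stack, emit it, then recurse on the rest of the string. -->
    <xsl:template name="process-fields">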
        <xsl:param name="element-stack" />
        <xsl:param name="code" />
        <xsl:if test="($code != '') and $element-stack">
            <xsl:variable name="field-no" select="
              substring-after($element-stack[1],'field')" />
            <xsl:variable name="field-parse-request">
                <so:field-parse-request code="{$code}">
                    <xsl:copy-of select="msxsl:node-set($map)/so:mapping/
                       field [@no=$field-no]" />
                </so:field-parse-request>
            </xsl:variable>
            <xsl:variable name="field-parse-result">
                <xsl:apply-templates
                    select="msxsl:node-set($field-parse-request)/*"
                    mode="field-parse" />
            </xsl:variable>
            <xsl:apply-templates
                   select="msxsl:node-set($field-parse-result)/so:output/*"
                   mode="remove-namespaces" />
            <xsl:call-template name="process-fields">
                <xsl:with-param name="element-stack"
                   select="$element-stack[position() &gt; 1]" />
                <xsl:with-param name="code"
                   select="msxsl:node-set($field-parse-result)/
                           so:output[1]/@code" />
            </xsl:call-template>
        </xsl:if>
    </xsl:template>

    <!-- Compound field: the header gives the combined length of the subfields. -->
    <xsl:template match="so:field-parse-request[field[subfield]]"
                     mode="field-parse">
        <so:output>
            <xsl:variable name="header"
                  select="substring(@code,1,field/@charlength)" />
            <xsl:attribute name="code">
                <xsl:value-of
                   select="substring(@code,1+field/@charlength+$header)" />
            </xsl:attribute>
            <xsl:element name="field{field/@no}">
                <xsl:call-template name="process-subfields">
                    <xsl:with-param name="subfield-stack" select="field/subfield" />
                    <xsl:with-param
                      name="code"
                      select="substring(@code,1+field/@charlength,$header)" />
                </xsl:call-template>
            </xsl:element>
        </so:output>
    </xsl:template>

    <!-- Variable field: the header gives the length of the payload. -->
    <xsl:template match="so:field-parse-request[field[.='variable']]"
             mode="field-parse">
        <so:output>
            <xsl:variable name="header"
                   select="substring(@code,1,field/@charlength)" />
            <xsl:attribute name="code">
                <xsl:value-of select="substring(@code,1+field/@charlength+$header)" />
            </xsl:attribute>
            <xsl:element name="field{field/@no}">
                <xsl:value-of select="substring(@code,1+field/@charlength,$header)" />
            </xsl:element>
        </so:output>
    </xsl:template>

    <!-- Variable subfield: idcode signature, then length header, then payload. -->
    <xsl:template match="so:field-parse-request[subfield[.='variable']]"
            mode="field-parse">
        <so:output>
            <xsl:variable name="header"
               select="substring( @code,
                                  1 + string-length( subfield/@idcode),
                                  subfield/@charlength)" />
            <xsl:attribute name="code">
                <xsl:value-of select="substring(
                     @code,
                     1 + string-length( subfield/@idcode) +
                         subfield/@charlength + $header)" />
            </xsl:attribute>
            <xsl:element name="subfield{subfield/@no}">
                <xsl:value-of select="
                   substring( @code, 
                     1 + string-length( subfield/@idcode) +
                       subfield/@charlength, $header)" />
            </xsl:element>
        </so:output>
    </xsl:template>

    <!-- Fixed field or subfield: no header, just @total characters of payload. -->
    <xsl:template match="so:field-parse-request[ field[.='fixed']] | so:field-parse-request[subfield[.='fixed']]" mode="field-parse">
        <so:output>
            <xsl:attribute name="code">
                <xsl:value-of select="substring(@code, (field/@total | subfield/@total) + 1)" />
            </xsl:attribute>
            <xsl:element name="{concat( name(field|subfield) ,field/@no | subfield/@no)}">
                <xsl:value-of select="substring(@code,1,field/@total | subfield/@total)" />
            </xsl:element>
        </so:output>
    </xsl:template>

    <!-- Same recursion as process-fields, applied to the subfields of a
         compound field. -->
    <xsl:template name="process-subfields">
        <xsl:param name="subfield-stack" />
        <xsl:param name="code" />
        <xsl:if test="($code != '') and $subfield-stack">
            <!-- Select the subfield to parse next as a node-set. (A variable
                 with content is a result tree fragment, so a predicate such as
                 [$index] would be a boolean test, not a position test.)
                 While the stack starts with a fixed subfield (no @idcode),
                 take it; otherwise take the subfield whose @idcode prefixes
                 $code. -->
            <xsl:variable name="active-subfield" select="
               $subfield-stack[1][not(@idcode)] |
               $subfield-stack[$subfield-stack[1]/@idcode]
                              [@idcode][starts-with($code,@idcode)][1]" />
            <xsl:variable name="field-parse-request">
                <so:field-parse-request code="{$code}">
                    <xsl:copy-of select="$active-subfield" />
                </so:field-parse-request>
            </xsl:variable>
            <xsl:variable name="field-parse-result">
                <xsl:apply-templates
                  select="msxsl:node-set($field-parse-request)/*"
                  mode="field-parse" />
            </xsl:variable>
            <xsl:apply-templates
                select="msxsl:node-set($field-parse-result)/so:output/*"
                mode="remove-namespaces" />
            <xsl:call-template name="process-subfields">
                <xsl:with-param name="subfield-stack"
                  select="$subfield-stack[generate-id() != generate-id($active-subfield)]" />
                <xsl:with-param name="code"
                   select="msxsl:node-set($field-parse-result)/ 
                           so:output[1]/@code" />
            </xsl:call-template>
        </xsl:if>
    </xsl:template>

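    <!-- Re-create elements by local name so that no stray namespace
         declarations are copied to the output. -->
    <xsl:template match="*" mode="remove-namespaces">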
        <xsl:element name="{local-name(.)}">
            <xsl:apply-templates select="@*|node()" mode="remove-namespaces" />
        </xsl:element>
    </xsl:template>

    <xsl:template match="@*|text()" mode="remove-namespaces">
        <xsl:copy />
    </xsl:template>

</xsl:stylesheet>

Input

...will take this input document...

<Root>
  <element>field2</element>
  <element>field3</element>
  <element>field21</element>
</Root>

Output

... and transform it according to all the stated rules, except rule 5, and produce this output...

<Root>
  <field2>3</field2>
  <field3>1123</field3>
  <field21>
    <subfield1>145</subfield1>
    <subfield2>12345678</subfield2>
  </field21>
</Root>
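Rule 5 is not covered by the style-sheet above. As a rough sketch of one way to approach it, the format could be chosen by comparing the header value with the sum of each format's fixed totals (3, 10 and 16 in the question's mapping). That selection rule is an assumption inferred from the numbers in the question, not something the question states. The sketch is stand-alone and is meant to be run with the question's mapping XML as the source document; the code parameter value and the format attribute on the output are hypothetical.

```xml
<xsl:stylesheet version="1.0"
   xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" indent="yes" />

    <!-- Hypothetical remaining string that starts at field 63. -->
    <xsl:param name="code" select="'101234567890'" />

    <!-- Source document: the mapping XML from the question. -->
    <xsl:template match="/Root">
        <xsl:variable name="field" select="field[@no='63']" />
        <xsl:variable name="header"
            select="number(substring($code, 1, $field/@charlength))" />
        <!-- Assumption: the right format is the one whose totals add up to the header. -->
        <xsl:variable name="format"
            select="$field/*[sum(subfield/@total) = $header][1]" />
        <field63 format="{name($format)}">
            <xsl:for-each select="$format/subfield">
                <!-- Characters already consumed by the earlier subfields of this format. -->
                <xsl:variable name="offset"
                    select="sum(preceding-sibling::subfield/@total)" />
                <xsl:element name="subfield{@no}">
                    <xsl:value-of select="substring($code,
                        1 + $field/@charlength + $offset, @total)" />
                </xsl:element>
            </xsl:for-each>
        </field63>
    </xsl:template>
</xsl:stylesheet>
```

With the sample value shown, the header is 10, so format2 should be chosen, giving subfield1 = 123 and subfield2 = 4567890. Wiring this into the field-parse templates above is left open.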
Sean B. Durkin