First, I don't have much experience with XSLT. I received an XML file from a third party, and some of its field values are longer than our system accepts. I would like to split those strings and produce well-formed XML before our system reads the file. I know this can be done with XSLT, but I don't know XSLT well enough to write it myself.

In this example, assume our system accepts at most 6 characters for "Doc_Header_Comments" and "Doc_Line_Comments".

Here is a sample input and the expected output after transforming it with XSLT.

Input XML format

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<Inbound_Sales_Document xmlns="urn:xmlports/x50000">
  <Doc_Sender_Id />
  <Doc_Date_Format>yyyyMMdd</Doc_Date_Format>
  <Doc_Time_Format>hhmmss</Doc_Time_Format>
  <Doc_Header>
    <General>
      <Doc_Header_Doc_Type>Order</Doc_Header_Doc_Type>
      <Doc_Header_Doc_No>202011231836178441</Doc_Header_Doc_No>
      <Doc_Header_Order_Date>20200023</Doc_Header_Order_Date>
      <Doc_Header_Comments>ABCDEFGHIJKLMNOP</Doc_Header_Comments>
    </General>
  </Doc_Header>
  <Doc_Line>
    <Doc_Line_Order_No>202011231836178441</Doc_Line_Order_No>
    <Doc_Line_Order_Line_No>1</Doc_Line_Order_Line_No>
    <Doc_Line_Comments>ABCDEFGHIJKLMNOP</Doc_Line_Comments>
  </Doc_Line>
  <Doc_Line>
    <Doc_Line_Order_No>202011231836178441</Doc_Line_Order_No>
    <Doc_Line_Order_Line_No>1</Doc_Line_Order_Line_No>
    <Doc_Line_Comments>ABC</Doc_Line_Comments>
  </Doc_Line>
</Inbound_Sales_Document>

Output XML format after the XSLT transformation

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<Inbound_Sales_Document xmlns="urn:xmlports/x50000">
  <Doc_Sender_Id />
  <Doc_Date_Format>yyyyMMdd</Doc_Date_Format>
  <Doc_Time_Format>hhmmss</Doc_Time_Format>
  <Doc_Header>
    <General>
      <Doc_Header_Doc_Type>Order</Doc_Header_Doc_Type>
      <Doc_Header_Doc_No>202011231836178441</Doc_Header_Doc_No>
      <Doc_Header_Order_Date>20200023</Doc_Header_Order_Date>
      <Doc_Header_Comments>
          <Comments>ABCDE</Comments>
          <Comments>FGHIJ</Comments>
          <Comments>KLMNO</Comments>
          <Comments>P</Comments>
      </Doc_Header_Comments>
    </General>
  </Doc_Header>
  <Doc_Line>
    <Doc_Line_Order_No>202011231836178441</Doc_Line_Order_No>
    <Doc_Line_Order_Line_No>1</Doc_Line_Order_Line_No>
    <Doc_Line_Comments>
        <Comments>ABCDE</Comments>
        <Comments>FGHIJ</Comments>
        <Comments>KLMNO</Comments>
        <Comments>P</Comments>
    </Doc_Line_Comments>
  </Doc_Line>
  <Doc_Line>
    <Doc_Line_Order_No>202011231836178441</Doc_Line_Order_No>
    <Doc_Line_Order_Line_No>1</Doc_Line_Order_Line_No>
    <Doc_Line_Comments>
        <Comments>ABC</Comments>
    </Doc_Line_Comments>
  </Doc_Line>
</Inbound_Sales_Document> 

I found a similar case, "XSLT - split string on every nth character in loop", but I have no idea how to adapt it to my requirement.

Could someone please help? Many thanks.

ib3an

1 Answer

Assuming XSLT 2.0+,

<xsl:template match="(Doc_Header_Comments | Doc_Line_Comments)[string-length() gt 6]">
  <xsl:variable name="text" select="string(.)"/>
  <xsl:copy>
     <xsl:for-each select="0 to (string-length($text) idiv 6) - 1">
        <Comments>
           <xsl:value-of select="substring($text, 1 + .*6, 6)"/>
        </Comments>
     </xsl:for-each>
  </xsl:copy>
</xsl:template>
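
The template above is a fragment. As a hedged sketch only, here is one way it might be wrapped into a complete XSLT 2.0 stylesheet for the sample input. It assumes the namespace `urn:xmlports/x50000` from the question, rounds the chunk count up so that a short final piece is kept, and drops the length predicate so that short values are wrapped in `Comments` too. The `max` parameter is a name introduced here for illustration: it defaults to 6 to match the stated limit, while the expected output in the question shows 5-character pieces, which would correspond to passing 5.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="urn:xmlports/x50000"
    xpath-default-namespace="urn:xmlports/x50000">

  <!-- Hypothetical parameter: maximum characters per Comments element. -->
  <xsl:param name="max" select="6"/>

  <xsl:strip-space elements="*"/>
  <xsl:output indent="yes"/>

  <!-- Identity template: copy everything that is not handled below. -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Unprefixed names here resolve via xpath-default-namespace. -->
  <xsl:template match="Doc_Header_Comments | Doc_Line_Comments">
    <xsl:variable name="text" select="string(.)"/>
    <xsl:copy>
      <!-- (length + max - 1) idiv max rounds up, so the last short piece is kept. -->
      <xsl:for-each select="1 to (string-length($text) + $max - 1) idiv $max">
        <Comments>
          <xsl:value-of select="substring($text, 1 + (. - 1) * $max, $max)"/>
        </Comments>
      </xsl:for-each>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

The default `xmlns` declaration on the stylesheet puts the new `Comments` elements in the same namespace as the rest of the document; without it they would be created in no namespace.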

If for some reason you're still using the older XSLT 1.0, then it needs a recursive named template.
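
A minimal sketch of that recursive approach, under the same assumptions as the question (namespace `urn:xmlports/x50000`, a limit of 6 characters). The template name `split`, the prefix `x`, and the `max` parameter are names chosen here for illustration. The named template emits the first `max` characters, then calls itself on the rest until nothing is left.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:x="urn:xmlports/x50000"
    xmlns="urn:xmlports/x50000"
    exclude-result-prefixes="x">

  <!-- Hypothetical parameter: maximum characters per Comments element. -->
  <xsl:param name="max" select="6"/>

  <!-- Identity template: copy everything that is not handled below. -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- XSLT 1.0 has no xpath-default-namespace, so the match uses a prefix. -->
  <xsl:template match="x:Doc_Header_Comments | x:Doc_Line_Comments">
    <xsl:copy>
      <xsl:call-template name="split">
        <xsl:with-param name="text" select="string(.)"/>
      </xsl:call-template>
    </xsl:copy>
  </xsl:template>

  <!-- Emit one piece, then recurse on the remainder of the string. -->
  <xsl:template name="split">
    <xsl:param name="text"/>
    <xsl:if test="string-length($text) &gt; 0">
      <Comments>
        <xsl:value-of select="substring($text, 1, $max)"/>
      </Comments>
      <xsl:call-template name="split">
        <xsl:with-param name="text" select="substring($text, $max + 1)"/>
      </xsl:call-template>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
```

The recursion depth equals the number of pieces, so very long comment strings could hit a processor's recursion limit; for short fields like these that is unlikely to matter.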

Michael Kay
  • Thanks @Michael, unfortunately it does not work. I tested it at https://www.freeformatter.com/xsl-transformer.html and it shows me an error. I used this XSLT with your XSLT code – ib3an Dec 04 '20 at 16:24
  • I'm not sure if the freeformatter.com tool you are using supports XSLT 2. Also, your input XML contains a namespace declaration, so you need to take it into account. See the example working here: https://xsltfiddle.liberty-development.net/ei5R4uk. The last part of the cut-up Comment is left out, so you'd need to look into the XSLT. – Sebastien Dec 04 '20 at 16:45
  • To add the last part of the cut-up string: add this after the for-each: . See it working here: https://xsltfiddle.liberty-development.net/ei5R4uk/1 – Sebastien Dec 04 '20 at 16:51
  • I tried freeformatter with a 2.0 stylesheet and all it said was "Unable to generate the XML document using the provided XML/XSL input. Errors were reported during stylesheet compilation" - it didn't tell me what the errors were! That's not going to be a very useful tool, then. – Michael Kay Dec 04 '20 at 17:38
  • Tried freeformatter again outputting `system-property('xsl:vendor')` and it says `Saxonica`, tried `xsl:version` and it says 3.0, tried `xsl:product-version` and it says `HE 9.7.0.1`, which is a bit old but not ridiculously; certainly this code should work. So the real problem is the lack of diagnostics from this tool. – Michael Kay Dec 04 '20 at 17:42
  • I tried to find out what it thinks the errors are and I've no idea, because Saxon runs it without error, though as @Sebastien says it needs some tweaks to get the output right. – Michael Kay Dec 04 '20 at 17:49
  • Hi @Sebastien and Michael Kay, I really appreciate your comments and solution. It's beautiful and works well. I would like to remove all namespaces from the output or input XML before applying your XSLT code. Is that possible? I tried to figure it out, but I don't know enough. Many thanks – ib3an Dec 05 '20 at 07:33
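
On the namespace question in the last comment, a common approach is a separate pass that rebuilds every element under its local name. The following is a minimal sketch of that idea, not something posted in the thread. It should run as XSLT 1.0 or later and assumes nothing beyond the standard instructions.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Rebuild each element under its local name, in no namespace. -->
  <xsl:template match="*">
    <xsl:element name="{local-name()}">
      <xsl:apply-templates select="@* | node()"/>
    </xsl:element>
  </xsl:template>

  <!-- Rebuild attributes the same way, dropping any prefix. -->
  <xsl:template match="@*">
    <xsl:attribute name="{local-name()}">
      <xsl:value-of select="."/>
    </xsl:attribute>
  </xsl:template>

  <!-- Text, comments and processing instructions pass through unchanged. -->
  <xsl:template match="text() | comment() | processing-instruction()">
    <xsl:copy/>
  </xsl:template>
</xsl:stylesheet>
```

If this runs before the splitting stylesheet, the splitting templates would then need to match `Doc_Header_Comments` and `Doc_Line_Comments` in no namespace, and the new `Comments` elements should be created without a default namespace declaration.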