We've been looking to generate a hash of a certain text from a given document and came up with the following version of XSLT:

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:iway="http://iway.company.com/saxon-extension">
    <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes" exclude-result-prefixes="iway"/>
    <xsl:strip-space elements="*"/>

    <xsl:template match="*[not(descendant::text()[normalize-space()])]"/>

    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>     
    </xsl:template>
    
    <xsl:template match="row" exclude-result-prefixes="iway">
    <xsl:variable name="jsonForHash" select="JSON_Output/text()"/>
    <xsl:variable name="iflExpression" select="concat('_sha1(''', $jsonForHash, ''')')"/>
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
            <CurrentDataHash type="12" typename="varchar"><xsl:value-of select="iway:ifl($iflExpression)"/></CurrentDataHash>   
            <Duplicity type="12" typename="varchar"><xsl:value-of select="$jsonForHash = LastDataHash/text()"/></Duplicity>     
        </xsl:copy>     
    </xsl:template>

</xsl:stylesheet>

...which does the job. The downside is that it cannot be tested locally (in Altova or Stylus Studio) without modification, and we would like to be able to do that. It works only in a runtime that relies on Saxon-HE-9*. In an attempt to fix this, we tried the version below (inspired by HERE):

<xsl:transform version="3.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:digest="java?path=jar:file:///C:/libs/commons-codec-1.13.jar!/">
    <xsl:output method="xml" indent="yes" omit-xml-declaration="yes"/>

    <xsl:template match="/">
        <Output>
            <xsl:apply-templates mode="hash"/>
        </Output>
    </xsl:template>
    
    <xsl:template match="SKU_SEG" mode="hash">
        <Group>
            <xsl:variable name="val" select="."/>
            <xsl:copy-of select="$val"/>
            <xsl:variable name="hash-val" select="digest:org.apache.commons.codec.digest.DigestUtils.md5Hex($val)"/>
            <HashValue>
                <xsl:value-of select="$hash-val"/>
            </HashValue>
        </Group>
    </xsl:template>
    
</xsl:transform>

...which works locally in Altova but not at runtime, because we use Saxon-HE and reflexive calls to Java methods are supported only in Saxon-PE/EE. To overcome this, we came up with this version:

<xsl:transform version="3.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:digest="java?path=jar:file:///C:/libs/commons-codec-1.13.jar!/" xmlns:iway="http://iway.company.com/saxon-extension" exclude-result-prefixes="digest iway">
    <xsl:output method="xml" indent="yes" omit-xml-declaration="yes" exclude-result-prefixes="digest iway"/>
    <xsl:template match="/">
        <Output>
            <xsl:apply-templates mode="hash"/>
        </Output>
    </xsl:template>
    <xsl:template match="SKU_SEG" mode="hash">
        <xsl:variable name="parserInfo" select="system-property('xsl:vendor')"/>
        <Group>
            <xsl:variable name="textForHash" select="."/>
            <xsl:variable name="iflExpression" select="concat('_sha1(''', $textForHash, ''')')"/>
            <xsl:copy-of select="$textForHash"/>
            <xsl:variable name="hashedVal">
                <xsl:choose>
                    <xsl:when test="contains(lower-case($parserInfo), 'saxon')">
                        <xsl:value-of select="iway:ifl($iflExpression)"/>
                    </xsl:when>
                    <xsl:otherwise>
                        <xsl:value-of select="digest:org.apache.commons.codec.digest.DigestUtils.md5Hex($textForHash)"/>
                    </xsl:otherwise>
                </xsl:choose>
            </xsl:variable>
            <HashValue>
                <xsl:value-of select="$hashedVal"/>
            </HashValue>
        </Group>
    </xsl:template>
</xsl:transform>

...which works locally in Altova XMLSpy but not at runtime, where Saxon reports the following error:

net.sf.saxon.trans.XPathException: 
Cannot find a 1-argument function named 
Q{java?path=jar:file:///C:/libs/commons-codec-1.13.jar!/}
org.apache.commons.codec.digest.DigestUtils.md5Hex(). 
Reflexive calls to Java methods are not available under Saxon-HE

Now the question: Is it possible to achieve the requirement at all? Thanks in advance.

Setup Info: 
Runtime: Java Application relying on Saxon-HE
XSLT Versions Supported: 1/2/3
Standalone Tool for local tests: Altova XMLSpy

PS: The version below (inspired by HERE) appears to work both locally and remotely as long as the text to be hashed is not too long. The text hashed here, however, is longer than what is permitted in an HTTP URL, so this is not an option:

<xsl:transform version="3.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" indent="yes" omit-xml-declaration="yes"/>
    
    <xsl:template match="/">
        <Output>
            <arg0>
                <xsl:text disable-output-escaping="yes">&lt;![CDATA[</xsl:text>
                <xsl:copy>
                    <xsl:apply-templates/>
                </xsl:copy>
                <xsl:text disable-output-escaping="yes">]]&gt;</xsl:text>
            </arg0>
            <arg1>
                <xsl:apply-templates mode="hash"/>
            </arg1>
        </Output>
    </xsl:template>
    
    <xsl:template match="SKU_SEG">
        <xsl:copy-of select="."/>
    </xsl:template>
    
    <xsl:template match="SKU_SEG" mode="hash">
        <xsl:variable name="val" select="."/>
        <!-- delegate to an external REST service to calculate the MD5 hash of the value -->
        <xsl:variable name="hash-val" select="unparsed-text(concat('http://localhost/md5?text=', encode-for-uri($val)))"/>
        <!-- the response from this service is wrapped in quotes, so need to trim those off -->
        <xsl:value-of select="substring($hash-val, 2, string-length($hash-val) - 2)"/>
    </xsl:template>
    
</xsl:transform>

For reference, here is the Saxon extension function:

 private void registeriWayXsltExtensions_iFLEval(final XDDocument docIn) {
    log(".init() Registering iWay XSLT extensions...", "info");
    this.iway_xslt_extension_ifl = new ExtensionFunction() {
        public QName getName() {
          return new QName("http://iway.company.com/saxon-extension", "ifl");
        }
        
        public SequenceType getResultType() {
          return SequenceType.makeSequenceType(ItemType.STRING, OccurrenceIndicator.ONE);
        }
        
        public SequenceType[] getArgumentTypes() {
          return 
            new SequenceType[] { SequenceType.makeSequenceType(ItemType.STRING, OccurrenceIndicator.ONE) };
        }
        
        public XdmValue call(XdmValue[] arguments) throws SaxonApiException {
          String iflExpression = ((XdmAtomicValue)arguments[0].itemAt(0)).getStringValue();
          SaxonXsltAgent.this.log(".execute()  Received iFL Expression: " + iflExpression, "info");
          String iflResult = null;
          if (iflExpression != null && !iflExpression.equals(""))
            iflResult = XDUtil.evaluate(iflExpression, docIn, SaxonXsltAgent.this.getSRM()); 
          return (XdmValue)new XdmAtomicValue(iflResult);
        }
      };
    this.xsltProcessor.registerExtensionFunction(this.iway_xslt_extension_ifl);
    log(".execute() \"ifl\" registered.", "info");
  }
Srii
  • Does this answer your question? [Reflexive calls to Java methods are not available under Saxon-HE](https://stackoverflow.com/questions/53846708/reflexive-calls-to-java-methods-are-not-available-under-saxon-he) – f1sh Feb 01 '23 at 14:02
  • No, it does not. One of the versions pasted in the raised question above already uses "integrated extension functions" of Saxon-HE which is how that piece of code works in runtime. Notice the use of "iway" namespaced functions. Purchase of Saxon-PE/EE is not an option, atm. Thus my look-out for other options. – Srii Feb 01 '23 at 14:16
  • Doesn't Stylus Studio at least allow you to run/test your XSLT code with Saxon HE and your integrated extension function? I think in oXygen you can provide a library path for such extension functions. – Martin Honnen Feb 01 '23 at 14:18
  • You don't actually say what the hash function is for. Is this something cryptographic, or is it for grouping and equality matching? – Michael Kay Feb 01 '23 at 14:42
  • The application generates a sha1 hash of every XML document that it transfers to a target system. The generated hash is held in a db lookup table against what we call a material number. The application may receive a document for the same material number more than once and is supposed to perform the transfer to the target only when something has changed in the current document compared with what was last sent. This is achieved by comparing the sha1 hash of the current document with that of the previous one held in the lookup table. – Srii Feb 01 '23 at 14:51
  • @MartinHonnen - I am still trying to figure this out on XMLSpy... – Srii Feb 01 '23 at 15:08
  • I have added a more complete example of my suggested approach to my answer; indeed it requires Saxon HE 10 or later as unfortunately the higher-order feature for function-lookup does not work with earlier HE versions. Tested only with Saxon HE and EE from Java or command line, don't have access to XMLSpy. – Martin Honnen Feb 01 '23 at 17:18
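
The change detection described in the comments above (hash each document, compare with the hash last stored for the same material number) can be sketched in plain Java as follows. This is only an illustration under stated assumptions: `ChangeDetector`, `sha1Hex` and `shouldTransfer` are hypothetical names, and an in-memory map stands in for the db lookup table.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper; not part of the application described in the question.
final class ChangeDetector {

    // Stand-in for the db lookup table: material number -> SHA-1 of the last document seen.
    private final Map<String, String> lastHashByMaterial = new HashMap<>();

    // Lower-case hex SHA-1 of the UTF-8 bytes of the text (JDK only).
    static String sha1Hex(String text) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            byte[] digest = md.digest(text.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide SHA-1.
            throw new IllegalStateException("SHA-1 not available", e);
        }
    }

    // Records the current hash and returns true when it differs from the stored one
    // (or when no hash has been stored for this material number yet).
    boolean shouldTransfer(String materialNumber, String document) {
        String current = sha1Hex(document);
        String previous = lastHashByMaterial.put(materialNumber, current);
        return !current.equals(previous);
    }
}
```

Calling `shouldTransfer` twice with the same material number and the same document text would return `true` the first time and `false` the second time.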

2 Answers

I would try e.g.

<xsl:value-of 
  select="iway:ifl($textForHash)" 
  use-when="exists(function-lookup(QName('http://iway.company.com/saxon-extension', 'ifl'), 1))"/>

and

<xsl:value-of select="digest:org.apache.commons.codec.digest.DigestUtils.md5Hex($textForHash)" 
  use-when="exists(function-lookup(QName('java?path=jar:file:///C:/libs/commons-codec-1.13.jar!/', 'org.apache.commons.codec.digest.DigestUtils.md5Hex'), 1))"/>

Here is a complete example of that approach (that should work with Saxon HE 10 or later only, admittedly, as earlier HE versions didn't support higher-order functions):

package org.example;

import net.sf.saxon.s9api.*;

import javax.xml.transform.stream.StreamSource;

public class Main {
    public static void main(String[] args) throws SaxonApiException {
        Processor processor = new Processor(true);

        ExtensionFunction sqrt = new ExtensionFunction() {
            public QName getName() {
                return new QName("http://example.org/mf", "sqrt");
            }

            public SequenceType getResultType() {
                return SequenceType.makeSequenceType(
                        ItemType.DOUBLE, OccurrenceIndicator.ONE
                );
            }

            public SequenceType[] getArgumentTypes() {
                return new SequenceType[]{
                        SequenceType.makeSequenceType(
                                ItemType.DOUBLE, OccurrenceIndicator.ONE)};
            }

            public XdmValue call(XdmValue[] arguments) throws SaxonApiException {
                double arg = ((XdmAtomicValue)arguments[0].itemAt(0)).getDoubleValue();
                double result = Math.sqrt(arg);
                return new XdmAtomicValue(result);
            }
        };

        processor.registerExtensionFunction(sqrt);

        XsltCompiler xsltCompiler = processor.newXsltCompiler();

        Xslt30Transformer xslt30Transformer = xsltCompiler.compile(new StreamSource("sheet1.xsl")).load30();

        xslt30Transformer.callTemplate(null, xslt30Transformer.newSerializer(System.out));
    }
}

XSLT

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="3.0"
                xmlns:xs="http://www.w3.org/2001/XMLSchema"
                xmlns:mf="http://example.org/mf"
                xmlns:java-math="java:java.lang.Math"
                exclude-result-prefixes="#all"
                expand-text="yes">

    <xsl:mode on-no-match="shallow-copy"/>

    <xsl:output indent="yes"/>

    <xsl:template match="/" name="xsl:initial-template">
        <test>
            <integrated-extension-function>
                <xsl:value-of select="mf:sqrt(4)" use-when="exists(function-lookup(QName('http://example.org/mf', 'sqrt'), 1))"/>
            </integrated-extension-function>
            <reflexive-extension-function>
                <xsl:value-of select="java-math:sqrt(4)" use-when="exists(function-lookup(QName('java:java.lang.Math', 'sqrt'), 1))"/>
            </reflexive-extension-function>
        </test>
        <xsl:comment>Run with {system-property('xsl:product-name')} {system-property('xsl:product-version')} {system-property('Q{http://saxon.sf.net/}platform')}</xsl:comment>
    </xsl:template>

</xsl:stylesheet>

When run with the above Java code registering the function in Saxon HE 11 the result is e.g.

<?xml version="1.0" encoding="UTF-8"?>
<test>
   <integrated-extension-function>2</integrated-extension-function>
   <reflexive-extension-function/>
</test>
<!--Run with SAXON HE 11.4 -->

When running the XSLT through Saxon EE without registering the integrated extension function, the output is

<?xml version="1.0" encoding="UTF-8"?>
<test>
   <integrated-extension-function/>
   <reflexive-extension-function>2</reflexive-extension-function>
</test>
<!--Run with SAXON EE 11.4 -->

So with (xsl:)use-when="exists(function-lookup(..))" you can conditionally include code only when a certain function is available.

Sample project on Github: https://github.com/martin-honnen/SaxonHEIntegratedExtFnSample2

Martin Honnen
  • thanks for the feedback. The use-when property appears to be a higher-order function and is not supported in Saxon-HE... – Srii Feb 01 '23 at 14:36
  • Higher order functions are supported in more recent versions of Saxon-HE, but they don't provide a way of bypassing the restrictions. – Michael Kay Feb 01 '23 at 14:40
  • Saxon 10 HE and later support higher-order functions. – Martin Honnen Feb 01 '23 at 15:05
  • @MichaelKay, I don't think the question is about avoiding restrictions, it is about having one stylesheet that uses some reflexive extension function if available or an integrated if available. My answer simply tries to check whether a function is available and should conditionally inject/execute code only if it is available. – Martin Honnen Feb 01 '23 at 17:17
  • @MartinHonnen A test with latest version Saxon-HE (12) indeed appears to address the problem. My gratitude for your time and efforts. – Srii Feb 02 '23 at 09:39

If you want to call out to Java in SaxonJ-HE then you need to implement an "integrated extension function" and register it with the Saxon configuration, rather than relying on dynamic loading and reflexive invocation.

It's not difficult: see https://www.saxonica.com/documentation12/index.html#!extensibility/extension-functions-J/ext-simple-J
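
As a rough sketch of the hashing logic that such an integrated extension function could delegate to from its `call` method, the following uses only the JDK. `HashSupport` and its `md5Hex` method are hypothetical names chosen for illustration; the method is assumed to match what `DigestUtils.md5Hex(String)` from commons-codec returns (lower-case hex of the MD5 of the UTF-8 bytes), so that both branches of the stylesheet would produce the same value.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper; an integrated extension function could call md5Hex
// on the string value of its single argument and wrap the result.
final class HashSupport {

    private HashSupport() {
    }

    // Lower-case hex MD5 of the UTF-8 bytes of the text.
    static String md5Hex(String text) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(text.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                // Emit the high nibble, then the low nibble, as hex digits.
                hex.append(Character.forDigit((b >> 4) & 0xF, 16));
                hex.append(Character.forDigit(b & 0xF, 16));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide MD5.
            throw new IllegalStateException("MD5 not available", e);
        }
    }
}
```

Because the text is passed as a function argument rather than in a URL, its length is not limited in the way the REST-based variant in the question is.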

Michael Kay
  • Please note that we are already using an extension function, which is how the evaluation/execution of the following expression succeeds: iway:ifl($iflExpression). I'll edit the question to include the extension function for better readability, as the comment section doesn't format code. – Srii Feb 01 '23 at 14:56
  • You mean the requirement is to write an extension function that works both under Saxon and under Altova? I think you can achieve this with two extension functions, one for each processor, that have the same interface and the same effect, but I think you'll need two different implementations to satisfy the different APIs. – Michael Kay Feb 01 '23 at 15:08