I am using <xsl:template match="m:*/text()"> to match text in my XML document. This works for plain text, for predefined entities such as &amp;, and for numeric character references such as &#x003C0;.

What does not work is matching custom named entities. For example, my XML document contains the entity &pi;, which I expected text() to match. For some reason the entity is not treated as text, so nothing is matched.

Please note that I did declare the entity name in the Doctype declaration of the XML Document, and of the XSLT Document as well:

<!DOCTYPE xsl:stylesheet [<!ENTITY pi "&#x003C0;">]>

Is text() the right approach to matching custom entity names, or do I need to use another function? (Maybe I also did something wrong declaring the entity name?)

Thanks

Edit

XML

<!DOCTYPE mathml [<!ENTITY pi "&#x003C0;">]>
<math xmlns="http://www.w3.org/1998/Math/MathML" display="inline">    
    <mi>&pi;</mi>
    <mi>test</mi>
    <mi>&#x003C0;</mi>
</math>

XSLT

<?xml version='1.0' encoding="UTF-8"?>
<!DOCTYPE xsl:stylesheet [<!ENTITY pi "&#x003C0;">]>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:m="http://www.w3.org/1998/Math/MathML"
                version='1.0'>

    <xsl:template match="m:*/text()">
        <xsl:call-template name="replaceEntities">
            <xsl:with-param name="content" select="normalize-space()"/>
        </xsl:call-template>
    </xsl:template>

    <xsl:template name="replaceEntities">
        <xsl:param name="content"/>
        <xsl:value-of select="$content"/>
    </xsl:template>
</xsl:stylesheet>

The variable $content should be printed three times, but only test and &#x003C0; are printed.

Processing using PHP

$xslDoc = new DOMDocument();
$xslDoc->load("doc.xsl");
$xslProcessor = new \XSLTProcessor();
$xslProcessor->importStylesheet($xslDoc);
$mathMLDoc = new DOMDocument();
$mathMLDoc->loadXML('<!DOCTYPE mathml [<!ENTITY pi "&#x003C0;">]><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><mi>&pi;</mi><mi>test</mi><mi>&#x003C0;</mi></math>');
echo $xslProcessor->transformToXML($mathMLDoc);
ksbg
  • Just a guess here, but could it be that the XML parser parsing your XSLT swaps the `&pi` with `π` and that's why `text()` won't match it? – Anders R. Bystrup Apr 20 '15 at 14:01
  • Please show a complete, minimal and verifiable sample of both your input XML and your XSLT stylesheet: http://stackoverflow.com/help/mcve. Thanks. – Mathias Müller Apr 20 '15 at 14:03
  • I don't think so, since it does match if I replace `&pi;` in the XML document with `&#x003C0;`. So even if it were swapped, `text()` should still be able to match it. @MathiasMüller I will – ksbg Apr 20 '15 at 14:03
  • and explain how you are parsing the XML and invoking the stylesheet - `text()` should match _any_ text node, so your issue is probably something specific to the way you're invoking the transformation rather than anything wrong with the transformation itself. – Ian Roberts Apr 20 '15 at 14:04
  • Interesting, the output from Saxon 9.5 is `πtestπ`, using the PHP code it's `testπ`. – Mathias Müller Apr 20 '15 at 14:27
  • That is indeed interesting. I wonder if there is a way to get the right output using PHP's XSLTProcessor. – ksbg Apr 20 '15 at 14:37

1 Answer

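As far as I can see, the problem is that the entity declared in the DTD is not expanded before the document reaches the XSLT processor: by default, PHP's DOMDocument leaves entity references unexpanded when loading, so the first mi element contains an entity reference rather than a text node for text() to match. Set the following before loading the document to substitute entities with their textual value: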

$mathMLDoc->substituteEntities = true;

as in

$xslDoc = new DOMDocument();
$xslDoc->load("tree.xsl");
$xslProcessor = new \XSLTProcessor();
$xslProcessor->importStylesheet($xslDoc);
$mathMLDoc = new DOMDocument();
$mathMLDoc->substituteEntities = true;
$mathMLDoc->loadXML('<!DOCTYPE math [<!ENTITY pi "&#x003C0;">]><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><mi>&pi;</mi><mi>test</mi><mi>&#x003C0;</mi></math>');
echo $xslProcessor->transformToXML($mathMLDoc);

which will produce

<?xml version="1.0"?>
πtestπ

Some background: http://php.net/manual/en/xsltprocessor.transformtoxml.php#99932 and http://hublog.hubmed.org/archives/001854.html.
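
To see the difference directly, the sketch below loads the question's input twice and inspects the first child of the first mi element. It also uses the LIBXML_NOENT option of loadXML(), which I believe has the same effect as setting substituteEntities; that equivalence is an assumption on my part, and the variable names are made up for illustration.

```php
<?php
// Same input as in the question: the internal DTD subset declares &pi;.
$xml = '<!DOCTYPE math [<!ENTITY pi "&#x003C0;">]>'
     . '<math xmlns="http://www.w3.org/1998/Math/MathML" display="inline">'
     . '<mi>&pi;</mi><mi>test</mi><mi>&#x003C0;</mi></math>';

// Default load: the entity reference is not substituted.
$plain = new DOMDocument();
$plain->loadXML($xml);
$firstPlain = $plain->getElementsByTagName('mi')->item(0)->firstChild;

// Load with LIBXML_NOENT: entity references are replaced by their value
// (assumed here to be equivalent to substituteEntities = true).
$expanded = new DOMDocument();
$expanded->loadXML($xml, LIBXML_NOENT);
$firstExpanded = $expanded->getElementsByTagName('mi')->item(0)->firstChild;

// Compare what kind of node each document holds inside the first <mi>.
var_dump($firstPlain->nodeType === XML_ENTITY_REF_NODE);
var_dump($firstExpanded->nodeType === XML_TEXT_NODE);
```

Note that substituting entities also applies to external entities, so this option is probably best avoided for XML from untrusted sources.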

Mathias Müller