I've run into an odd issue that I'm fairly sure is an encoding error, but while troubleshooting it, PHP showed some odd behavior that I'm hoping someone can help me make sense of.
I have some XML that is generated via XQuery:
<?xml version="1.0" encoding="UTF-8"?>
<list>
<item>
<orig>London, British Library Harley 2251: <ref target="Quis_Dabit/British_Library_Harley_2251/British_Library_Harley_2251_f42v.html">
<orig xmlns="http://www.tei-c.org/ns/1.0">O alle ye doughtres · of Jerusalem</orig>
</ref>
</orig>
</item>
<item>
<orig>London, British Library Harley 2255: <ref target="Quis_Dabit/British_Library_Harley_2255/British_Library_Harley_2255_f67r.html">
<orig xmlns="http://www.tei-c.org/ns/1.0">
<hi rend="blue_pilcrow">¶</hi>O alle ye douħtren of <hi rend="underline">ierusaleem</hi>
</orig>
</ref>
</orig>
</item>
<item>
<orig>Long Melford, Holy Trinity Church Clopton Chantry Chapel: <ref target="Quis_Dabit/Clopton/ww_qd_2.html">
<orig xmlns="http://www.tei-c.org/ns/1.0">
<hi>O</hi> alle ye <gap quantity="8" unit="chars" reason="illegible"/>s of ierusaleem</orig>
</ref>
</orig>
</item>
<item>
<orig>Cambridge, Jesus College Q.G.8: <ref target="Quis_Dabit/Jesus_College_Q_G_8/Jesus_Q_G_8_f20r.html">
<orig xmlns="http://www.tei-c.org/ns/1.0">
<hi>A</hi>ll the <hi rend="underline">doughtren </hi>of <hi rend="underline">Ierusalem</hi> .</orig>
</ref>
</orig>
</item>
<item>
<orig>Oxford, Bodleian Library Laud 683: <ref target="Quis_Dabit/Laud_683/Laud_683_f78v.html">
<orig xmlns="http://www.tei-c.org/ns/1.0">O alle ẏe douhtren of jerusaleem</orig>
</ref>
</orig>
</item>
<item>
<orig>Oxford, St. John's College 56: <ref target="Quis_Dabit/St_John_56/St_John_56_73v.html">
<orig xmlns="http://www.tei-c.org/ns/1.0">O alle the doughtren / of Jerusalem ؛</orig>
</ref>
</orig>
</item>
</list>
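I then load it into PHP:
// Run the XQuery with Saxon and capture what exec() returns
$text = exec ("java -cp saxon9he.jar net.sf.saxon.Query -t -q:test.xq");
// Parse the query result as XML
$xml = new DOMDocument;
$xml->loadXML($text);
// Load the XSLT stylesheet
$xsl = new DOMDocument;
$xsl->load('comparison.xsl');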
// Configure the transformer
$proc = new XSLTProcessor;
$proc->importStyleSheet($xsl); // attach the xsl rules
echo $proc->transformToXML($xml);
and apply an XSLT stylesheet to it:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
exclude-result-prefixes="xs"
version="1.0">
<xsl:output method="html" encoding="UTF-8"/>
<xsl:template match="list">
<div class="comparison">
<ul>
<xsl:apply-templates/>
</ul>
</div>
</xsl:template>
<xsl:template match="item">
<li>
<xsl:apply-templates/>
</li>
</xsl:template>
</xsl:stylesheet>
However, when I do so, the non-ASCII characters in the resulting output (such as ¶, ħ, and ẏ) come out garbled.
My assumption was that this is an encoding issue with the results, so I added a print_r statement to show me both the raw XML generated and the DOM tree, then refreshed the page. With the print_r in place, the characters displayed correctly.
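A check like print_r depends on how the browser decodes the page, so it may help to look at the bytes themselves. Below is a minimal sketch of one way to do that, not code from my actual script: `describeBytes` is a hypothetical helper, and `$sample` is a stand-in for the string that comes back from the query.

```php
<?php
// Hypothetical helper (not part of the original script): summarize a string's
// raw bytes so encoding problems are visible no matter how the page is rendered.
function describeBytes(string $s): array
{
    return [
        'bytes'      => strlen($s),                  // raw byte count, not character count
        'valid_utf8' => preg_match('//u', $s) === 1, // the /u modifier rejects invalid UTF-8
        'hex'        => bin2hex($s),                 // hex dump of every byte
    ];
}

// Stand-in for the query output: "\u{0127}" is ħ (U+0127), as in "douħtren".
$sample = "dou\u{0127}tren";
print_r(describeBytes($sample));

// The same word with a lone 0xF1 byte in place of ħ is not valid UTF-8.
print_r(describeBytes("dou\xF1tren"));
```

If the string is valid UTF-8 at this point, the garbling is presumably introduced later, in the transform or in how the page declares its character set. That is an assumption to verify, not a diagnosis.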
I don't doubt it's an encoding error, and I plan on tracking it down, but what I want to know is why the output displays correctly when I add a print_r statement and does not when I leave it out. Is there something I should add to the PHP file that I haven't? Thanks!