I'm using PHP's DOM extension to build an HTML document.

At the end of the document, I create a script element.

If the script contains special characters, specifically < and >, they are converted to the entities &lt; and &gt;.

This is obviously a problem if I have any strings containing those characters (or, in my case, regexes).

Is there a non-hackish way (i.e. NOT string replacement) to prevent this behaviour in script tags ONLY?

Pez
  • [As answered](http://stackoverflow.com/a/18487888/367456), this normally works. Here is an online demo for multiple PHP and libxml versions: http://3v4l.org/ntvAh - And you might be interested in reading this: [When is a CDATA section necessary within a script tag?](http://stackoverflow.com/q/66837/367456) / [4.8. Script and Style elements - XHTML 1.0](http://www.w3.org/TR/xhtml1/#h-4.8) – hakre Aug 28 '13 at 12:33
  • That's great thanks, looks like I need to add the content of the script element as a CDATA section! – Pez Aug 28 '13 at 12:52
  • I added a CDATA example (XML/HTML hybrid code) to my answer. DOMDocument is clever enough to tell the two apart at output time. If you keep that in mind and insert CDATA from the start, you can easily switch the output format later on. – hakre Aug 28 '13 at 13:02
  • Thanks, that clicked as soon as I read the links you sent. Is there a way to avoid JavaScript errors caused by the CDATA tags? I realise that they're supposed to be allowed as part of the spec, but it seems that isn't the case in practice – Pez Aug 28 '13 at 13:32
  • I get 'Syntax Error <![CDATA[' from my browser – Pez Aug 28 '13 at 13:59
  • If you get that, you are sending the wrong MIME type for your XML document: your browser is in HTML mode, not XML mode. Fix that or use HTML instead of XML. – hakre Aug 28 '13 at 14:11

1 Answer

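This is normally not a problem. Those characters are only encoded as &lt; and &gt; when you serialize with DOMDocument::saveXML(). With DOMDocument::saveHTML(), they are output as literal < and > inside a <script> element. If you need XML output, add the code as a CDATA section, which saveXML() emits unescaped.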

Example:

<?php
/**
 * PHP DOM and JavaScript with HTML entities
 *
 * @link http://stackoverflow.com/q/18487515/367456
 */

$doc = new DOMDocument("1.0");
$doc->loadXML('<head/>');

$javascriptCode = "\n  if (1 < 4) {\n    alert(\"hello\");\n  }\n";

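// Add the code as a CDATA section: saveXML() wraps it in <![CDATA[...]]>
// without escaping, while saveHTML() writes the raw text.
$script = $doc->createElement('script');
$script->appendChild($doc->createCDATASection($javascriptCode));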

$head         = $doc->getElementsByTagName('head')->item(0);
$scriptInHead = $head->appendChild($script);

echo 'libxml: ', LIBXML_DOTTED_VERSION, "\n"
    , "\nXML:\n", $doc->saveXML()
    , "\nHTML:\n", $doc->saveHTML()
;

Program output (multi-version demo):

libxml: 2.7.8

XML:
<?xml version="1.0"?>
<head><script><![CDATA[
  if (1 < 4) {
    alert("hello");
  }
]]></script></head>

HTML:
<head><script>
  if (1 < 4) {
    alert("hello");
  }
</script></head>
hakre
  • Cheers, I understand that part, but I'm using XHTML and, as you probably know, saveHTML() doesn't always produce valid XHTML (especially if you use self-closing tags...). Does this imply that I'm going to have to wrestle with my HTML template and saveHTML() to get it to produce valid XHTML with the script? – Pez Aug 28 '13 at 12:49
  • @Pez: In X(HT)ML you insert a CDATA section. See my comment below your question. – hakre Aug 28 '13 at 12:50
  • @Pez: Can you please add code to your question showing *how* you create XHTML with PHP DOMDocument? I would be interested to see that. – hakre Aug 28 '13 at 14:35
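
For the 'Syntax Error <![CDATA[' case raised in the comments, where an XHTML document ends up being parsed in HTML mode, one commonly used workaround is to hide the CDATA markers inside JavaScript line comments. The following is only a minimal sketch, assuming the same DOMDocument setup as in the answer; `appendScriptCode` is a hypothetical helper name, not part of PHP's DOM API.

```php
<?php
// Hypothetical helper (not from the original answer): appends JavaScript to a
// <script> element so the CDATA markers sit inside JavaScript line comments.
function appendScriptCode(DOMDocument $doc, DOMElement $script, string $code): void
{
    // The leading "//" turns the opening "<![CDATA[" marker into a JS comment.
    $script->appendChild($doc->createTextNode('//'));
    // The trailing "//" inside the section comments out the closing "]]>".
    $script->appendChild($doc->createCDATASection("\n" . $code . "\n//"));
}

$doc = new DOMDocument('1.0');
$doc->loadXML('<head/>');

$script = $doc->createElement('script');
appendScriptCode($doc, $script, 'if (1 < 4) { alert("hello"); }');
$doc->documentElement->appendChild($script);

// Serialize only the root element, without the XML declaration.
$out = $doc->saveXML($doc->documentElement);
echo $out, "\n";
```

With saveXML(), the script content is wrapped as `//<![CDATA[` ... `//]]>`. An XML parser treats the markers as a CDATA section, while an HTML parser passes them to the JavaScript engine, which reads them as comments. The sketch does not try to handle code that itself contains `]]>`.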