<script type="text/javascript">
    function test() {
        alert('&lt;span&gt;blah&lt;span&gt;');
    }
</script>
<a href="#" onclick="test();">First</a><br />
<a href="#" onclick="alert('&lt;span&gt;blah&lt;span&gt;');">Second</a><br />
Third: &lt;span&gt;blah&lt;span&gt;

Demo: http://jsfiddle.net/LPYTZ/

Why is the first result different? Are <script> tags somehow excluded from entity conversion?

AndreKR

2 Answers

In HTML, script and style elements are defined in the DTD as containing CDATA. This means that entities and tags inside them are not interpreted: the content is passed through as literal text until the parser hits something that looks like an end tag.
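
Applied to the question's example, the difference can be sketched in plain JavaScript. This is only an illustration and does not run a real HTML parser: `decodeBasicEntities` is a hypothetical helper that handles just the three entities involved, standing in for what the parser does to attribute values and ordinary text.

```javascript
// Hypothetical stand-in for the parser's entity decoding (not a real API).
// It only knows the three entities needed here; '&amp;' is handled last so
// that text is never decoded twice.
function decodeBasicEntities(text) {
  return text
    .replace(/&lt;/g, '<')
    .replace(/&gt;/g, '>')
    .replace(/&amp;/g, '&');
}

const source = "alert('&lt;span&gt;blah&lt;span&gt;');";

// Inside a script element (text/html): raw text, handed to the JS engine as is.
const seenInScriptElement = source;

// Inside an onclick attribute: the attribute value is decoded first,
// so the JS engine receives real angle brackets.
const seenInOnclickAttribute = decodeBasicEntities(source);

console.log(seenInScriptElement);
console.log(seenInOnclickAttribute);
```

That is why the "First" link alerts the entities literally while the "Second" link alerts actual tags.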

XHTML is different and entities and tags inside those elements function as normal — but only when parsed as XHTML. You can explicitly mark content as CDATA with <![CDATA[ … ]]>.

Browsers will treat XHTML served as text/html using HTML rules, which leads to a big ball of nasty as you try to write code that is correct under both sets of rules.
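
One widely used workaround, if the script has to stay inline, is to hide the CDATA delimiters behind JavaScript line comments. The following is a minimal sketch of what would go between the script tags, assuming the same page may end up parsed either as XHTML or as text/html. It reuses the question's `test()` but returns the string instead of alerting it, purely so the result is easy to check.

```javascript
//<![CDATA[
// Parsed as XHTML: the CDATA section keeps '<' and '&' literal.
// Parsed as text/html: the content is raw text anyway, and the two
// marker lines are ordinary JavaScript comments.
function test() {
  return '<span>blah<span>';
}
//]]>
```

Under both sets of rules the JavaScript engine should see the same function, with real angle brackets written directly in the string.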

The simplest way to avoid problems is to keep scripts in external files and use the src attribute to include them.

Quentin
  • As I already *had* XHTML (in jsfiddle, too) +1 for the "Browsers will treat XHTML served as text/html using HTML rules" part. – AndreKR Nov 19 '10 at 18:25
  • @AndreKR: It’s the MIME media type that matters, not the content. – Gumbo Nov 19 '10 at 18:37
  • What about HTML5? – Pedro Gimeno Jun 13 '18 at 00:59
  • @PedroGimeno — HTML 5 is HTML (except when it is XHTML). – Quentin Jun 13 '18 at 06:23
  • @Quentin, HTML 5 may be HTML, but [it is not SGML](https://stackoverflow.com/questions/16185880/html5-is-not-based-on-sgml-so-what-is-it-based-on-then) and therefore there's no DTD for it. Your reply references a DTD that does not exist for HTML5. That's why I'm asking. – Pedro Gimeno Jun 14 '18 at 13:51
  • @PedroGimeno — The rules are the same, even though the means of expressing them is different. – Quentin Jun 14 '18 at 13:54
  • I was kind of hoping for a reference to a normative source that states this. I've dug into it myself; it's interspersed in the syntax description, but the gist of it is here: https://www.w3.org/TR/html50/syntax.html#script-data-state - note that the only special characters are NULL and `<`; the rest including `&` are emitted verbatim. Other states may apply after opening a ` – Pedro Gimeno Jun 20 '18 at 20:40
  • @PedroGimeno — Well yes. That's the point. Special interpretation of `&` happens *outside* scripts. – Quentin Jun 20 '18 at 20:46

Yes, the content model of STYLE and SCRIPT is special:

Although the STYLE and SCRIPT elements use CDATA for their data model, for these elements, CDATA must be handled differently by user agents. Markup and entities must be treated as raw text and passed to the application as is. The first occurrence of the character sequence "</" (end-tag open delimiter) is treated as terminating the end of the element's content. In valid documents, this would be the end tag for the element.
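
A practical consequence is that a string literal containing that sequence can end an inline script early. Below is a minimal sketch of three ways to spell such a string so the sequence never appears literally in the source. The variable names are illustrative only. Note that the rule quoted above is the HTML 4 one; as far as I know, current browsers only end the element at an actual script end tag, which these spellings avoid as well.

```javascript
// Each literal evaluates to the same string, but none of them contains
// the raw end-tag open delimiter in the source text.
const concatenated = '<' + '/script>';    // split the delimiter in two
const unicodeEscaped = '\u003C/script>';  // escape the '<'
const backslashEscaped = '<\/script>';    // escape the '/'

console.log(concatenated === unicodeEscaped);
console.log(unicodeEscaped === backslashEscaped);
```

The backslash form is usually the least intrusive, since `\/` is just an escaped slash inside a JavaScript string.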

Gumbo
  • Ha! So, to have that character sequence in a JS string, you need to do something like: `'<' + '/'` or `'\u003C/'` – z0r Jul 15 '21 at 04:17