Extremely strange glitch in Chrome - parses contents of string!

Question

Okay - this is the dumbest glitch I have seen in a while:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<script type='text/javascript'>

var data = "</script>";

</script>
</head>
<body>

This should break!

</body>
</html>

This causes syntax errors because the JavaScript parser is actually reading the contents of the string. How stupid!

How can I put </script> in my code? Is there any way?

Is there a valid reason for this behavior?

What's wrong with escaping? It's not just Chrome. I tested it on my Firefox 3.6 on Windows 7, and it also gives the same incorrect result. — Xavier Ho, Apr 23 '10 at 06:20
@Xavier Ho: The results are correct, the expectations are wrong. — , Apr 23 '10 at 06:24
The point is that HTML doesn't define a separate script type inside which it parses strings, and in the strings HTML tags are invalid. When it sees `</script>`, it doesn't care about what might or might not be strings inside the script. — , Apr 23 '10 at 06:31

score 5 · Accepted Answer · edited May 23 '17 at 11:48

Within X(HT)ML (when actually treated as such), scripts are required to be escaped as CDATA for precisely this reason. http://www.w3.org/TR/xhtml1/diffs.html#h-4.8

In XHTML, the script and style elements are declared as having #PCDATA content. As a result, < and & will be treated as the start of markup, and entities such as &lt; and &amp; will be recognized as entity references by the XML processor to < and & respectively. Wrapping the content of the script or style element within a CDATA marked section avoids the expansion of these entities.
<script type="text/javascript">
<![CDATA[
  ... unescaped script content ...
]]>
</script>

If your XHTML document is just served as text/html and treated as tag soup, that doesn't apply and you'll just have to "escape" the string like '</scr' + 'ipt>'.

CDATA... oh yeah. I forgot about that. I always did it with ActionScript. Forgot that you can do it with JavaScript... :) — Nathan Osman, Apr 23 '10 at 06:24
Ah. I get it. Oh well. The other solution you proposed seems to work alright. — Nathan Osman, Apr 23 '10 at 06:37

score 2 · Answer 2 · answered Apr 23 '10 at 06:27

It's not a glitch - this is normal expected behaviour and quite rightly so if you think about it. HTML specs do not define scripting languages, so all the engine should see is plain text up until </script>, which closes the tag. There are a couple of options, other than the ones already outlined:

// escape the / character, changing the format of the "closing" tag
var data = "<\/script>"; 

// break up the string
var data = "</"+"script>";

The first method works because HTML doesn't use \ for escaping, it's treated as a literal character, and of course <\/script> isn't a valid closing tag. The second one works for more obvious reasons, but I've been told by someone else here that it shouldn't be used (and I never quite understood why).
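
A quick check that both forms yield the same runtime value as the plain literal. This is only a sketch meant to be run in Node.js or a browser console, not inside an inline HTML script, and the variable names are made up for illustration:

```javascript
// The plain literal is only safe here because this code is not embedded in HTML.
const plain = "</script>";

// Form 1: \/ is an ordinary escape in a JavaScript string and evaluates to "/".
const escaped = "<\/script>";

// Form 2: the closing tag never appears as one contiguous run in the source.
const concatenated = "</" + "script>";

console.log(escaped === plain);      // compares form 1 with the plain literal
console.log(concatenated === plain); // compares form 2 with the plain literal
```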

The second one seems to work okay. I don't know why I wouldn't use it either. — Nathan Osman, Apr 23 '10 at 06:32
@George: The first would be preferable to me anyway, it just doesn't feel right to break up two string literals ;-) — Andy E, Apr 23 '10 at 06:50

Samuel Edwin Ward · Answer 3 · 2012-11-06T16:32:24.477

If you can believe the HTML4 standard, the script content

ends at the first ETAGO ("</") delimiter followed by a name start character ([a-zA-Z])

So, the JavaScript parser is not reading the contents of the string as you describe; the JavaScript parser never gets anything after var data = ", which obviously isn't a valid script.
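
To make that concrete, here is a rough sketch of the HTML4 rule quoted above. The function name `findScriptContentEnd` is hypothetical, and this only models the quoted wording; browsers in practice look specifically for `</script`, so treat it as an illustration rather than a description of any real parser:

```javascript
// Hypothetical helper modelling the quoted HTML4 rule: script content ends at
// the first "</" that is followed by a name start character ([a-zA-Z]).
// Returns the index of that "</", or -1 if the content never ends early.
function findScriptContentEnd(source) {
  const match = /<\/[a-zA-Z]/.exec(source);
  return match ? match.index : -1;
}

const inline = 'var data = "</script>";\n';
const end = findScriptContentEnd(inline);

// Everything before the delimiter is all the JavaScript engine would receive.
const seenByJs = end === -1 ? inline : inline.slice(0, end);
console.log(JSON.stringify(seenByJs));
```

The fragment handed to the JavaScript engine stops right after the opening quote, which is why the error is an unterminated string rather than anything about the tag itself.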

The simplest way to avoid accidentally ending your JavaScript early is to use Andy E's first suggestion:

var data = "<\/script>";

This way the HTML parser doesn't see </ so the script content doesn't end, and \/ is equivalent to / in a JavaScript string literal, so the results are correct. This is also the method shown for JavaScript in the standard.
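
If the string comes from generated data rather than being typed by hand, the same trick can be applied mechanically. The helper below is a sketch under that assumption; `toInlineScriptLiteral` is a hypothetical name, and it only handles the `</` sequence discussed in this answer:

```javascript
// Hypothetical helper: serialize a value as a JavaScript/JSON literal that
// contains no "</" sequence, so it cannot end an inline script element early.
function toInlineScriptLiteral(value) {
  // JSON.stringify leaves "/" unescaped; rewrite every "</" as "<\/".
  return JSON.stringify(value).replace(/<\//g, "<\\/");
}

const literal = toInlineScriptLiteral("</script>");

// The emitted source text no longer contains "</" ...
console.log(literal.includes("</"));
// ... but still decodes to the original string, since \/ is a valid escape.
console.log(JSON.parse(literal) === "</script>");
```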

score 0 · Answer 4 · answered Apr 23 '10 at 06:22

Write it this way:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<script type='text/javascript'>
<!--
var data = "</script>";
-->
</script>
</head>
<body>
This should break!
</body>
</html>

The reason is simply that the HTML is parsed before the JavaScript is executed, and the <!-- and --> markers make the parser ignore all tags that appear in this section.

What if your script contains `"-->"` in a string? ;-) — Andy E, Apr 23 '10 at 06:40
