964

Are CDATA tags ever necessary in script tags and if so when?

In other words, when and where is this:

<script type="text/javascript">
//<![CDATA[
...code...
//]]>
</script>

preferable to this:

<script type="text/javascript">
...code...
</script>
Arsen Khachaturyan
brad
  • 21
    Now that XHTML is essentially dead, is this no longer a relevant concern? – allyourcode Jan 30 '12 at 21:49
  • 84
    @allyourcode: what makes you think XHTML is dead? HTML5? There's XHTML5 to go right along with it :) – Doktor J Feb 22 '12 at 14:42
  • 4
    @DoktorJ AFAIK xHTML was at version 1. Its HTML equivalent was version 4. There was an effort concentrated in xHTML 2.0 intending to push the xform, xlink, time and svg namespaces into the spec as a manner of improving the same features HTML 5 was adding - xform/input-validation, time/animations, svg/canvas - but efforts for the xHTML 2 spec were refocused towards the HTML 5 features. That's not to say that xHTML 2 was dropped or became obsolete but it's not planned in the near future. – Mihai Stancu Aug 03 '12 at 11:08
  • 14
    XHTML is not dead in Java Seam / JSF / Facelets development. – JoJo Aug 30 '12 at 20:03
  • 17
    @Mihai Stancu -- that is not entirely correct. According to W3C there is an [XML syntax for HTML5](http://www.w3.org/TR/html5-diff/#syntax): "The other syntax that can be used for HTML5 is XML. This syntax is compatible with XHTML1 documents and implementations. Documents using this syntax need to be served with an XML media type and elements need to be put in the http://www.w3.org/1999/xhtml namespace following the rules set forth by the XML specifications." – BrainSlugs83 Oct 11 '12 at 00:03
  • 3
    @allyourcode: Mate XHTML is the preferred now, and is not dead. HTML should be dead except messy coders wouldnt be able to work. – ekerner Aug 10 '13 at 05:31
  • @allyourcode my military customers in the UK have just upgraded from IE6 to IE8 after five years of shouting at them. If XHTML is dead then I need a big paddle to get me out of this creek. Auto-updating browsers are not used by many large corporations (sadly). – EvilDr Sep 17 '14 at 07:36
  • @EvilDr But IE8 does not support XHTML yet. When IE9 arrives, then you will be able to use XHTML! Yes! – Mr Lister Aug 06 '15 at 12:42
  • @DoktorJ I doubt XHTML5 will become popular. XHTML has outlived its usefulness IMO. –  Sep 29 '15 at 11:31
  • 1
    SVG is alive and well, and requires CDATA declarations for any internal ecmascript that includes < and &. – brennanyoung Apr 02 '19 at 14:03
  • To prove that XHTML is not dead, visit [this website](https://infoplasticsurgery.com/). It's a genuine real XHTML website(!) – Jack G Feb 29 '20 at 16:01

15 Answers

604

A CDATA section is required if your document must parse as XML (e.g. when an XHTML page is interpreted as XML) and you want to write literal i<10 and a && b instead of i&lt;10 and a &amp;&amp; b. By default, XHTML treats the contents of a script element as parsed character data (PCDATA) rather than as character data (CDATA). This is not an issue for scripts stored in external source files, but for any inline JavaScript in XHTML you will probably want to use a CDATA section.

Note that many XHTML pages were never intended to be parsed as XML, in which case this is not an issue.

For a good writeup on the subject, see https://web.archive.org/web/20140304083226/http://javascript.about.com/library/blxhtml.htm
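
To make the two options concrete, here is a minimal sketch of both: escaping the script text as entities, or wrapping it in a CDATA section. The function names `escapeXmlText` and `wrapInCdata` and the sample `source` string are made up for illustration and are not part of any library. The sketch also escapes `>` and splits any literal `]]>`, because XML does not allow that sequence to appear unescaped in content.

```javascript
// Option 1: escape the characters an XML parser would treat as markup.
// "&" must be replaced first so the other entities are not double-escaped.
function escapeXmlText(code) {
  return code
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Option 2: wrap the code in a CDATA section. A literal "]]>" inside the
// code would end the section early, so it is split across two sections.
function wrapInCdata(code) {
  return "<![CDATA[" + code.split("]]>").join("]]]]><![CDATA[>") + "]]>";
}

// Hypothetical inline script that contains "<", "&" and "]]>".
const source = "if (i<10 && a[b[0]]>1) { run(); }";

console.log(escapeXmlText(source));
console.log(wrapInCdata(source));
```

The escaped form only works when the page is parsed as XML; an HTML parser does not decode entities inside a script element.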

Cody Gray - on strike
Michael Ridley
  • 52
    There's a lot more to it than just "validation". Most strict XML parsers won't go through the page if they hit an illegal character. It's more than simply about making W3C happy and getting green instead of red. – Loren Segal Sep 15 '08 at 20:58
  • 42
    If you avoid `&` and `<` characters, you don't need a CDATA section; it'll work fine in both HTML and XHTML. You can easily achieve this by putting all significant code in external scripts and just using inline scripts to eg. initialise variables (escaping `&`/`<` to `\x26`/`\x3C` in string literals if you need). – bobince Sep 20 '09 at 09:10
  • 24
    What about in the case of HTML5? – Mathew Attlee Dec 02 '09 at 11:44
  • 5
    @Mathew Attle - this is a good question. Be a great question to ask on a separate thread to ensure it gets the attention it needs. – Alex KeySmith Nov 06 '10 at 19:01
  • 3
    @Loren: Then it's still completely about validation. The extent to which a user-agent rejects invalid XML is orthogonal. – Lightness Races in Orbit Jun 09 '11 at 18:09
  • 3
    @chief: wrong, html5 contains a xhtml dialect enabled by serving it with xhtml mime type (for local files, that’s the .xhtml file ending) – flying sheep Jan 12 '12 at 22:52
  • what do you mean by parsed data vs. character data? – PositiveGuy Jun 04 '12 at 18:09
  • 2
    @Mathew Attlee: According to the W3C, [there is an XML syntax for HTML5](http://www.w3.org/TR/html5-diff/#syntax). When using the XML syntax, the case would be the same. – BrainSlugs83 Oct 11 '12 at 00:06
  • 2
    @CoffeeAddict character data is plain text. Something like `"` is used as is. Parsed data is parsed as HTML (or XML): `"` is interpreted as `"` and then the parser stops parsing because it encounters an end tag. – Mr Lister Apr 15 '14 at 11:01
  • @MathewAttlee this will not be necessary for HTML5. – Sanjib Debnath Apr 05 '18 at 06:53
  • 2
    The many comments about XHTML being dead are misleading because there are various other XML formats out there in common use, some of which are de facto parts of HTML5. For example, you still need CDATA for script elements inside SVG, because SVG is XML. – brennanyoung Apr 02 '19 at 14:11
  • From archived link to about.com: "_` – ruffin Jul 28 '20 at 19:01
238

When browsers treat the markup as XML:

<script>
<![CDATA[
    ...code...
]]>
</script>

When browsers treat the markup as HTML:

<script>
    ...code...
</script>

When browsers treat the markup as HTML and you want your XHTML 1.0 markup (for example) to validate:

<script>
//<![CDATA[
    ...code...
//]]>
</script>
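
If you generate such blocks from code, the third form can be sketched roughly as below. `buildPolyglotScript` is a hypothetical helper, not an existing API. It uses block comments around the CDATA markers instead of `//` line comments, so the result still works if line breaks are stripped, and it simply refuses input that would end the CDATA section or the script element early.

```javascript
// Build an inline script block intended to work under both an HTML parser
// and an XML parser. The CDATA markers sit inside JavaScript block comments.
function buildPolyglotScript(code) {
  if (code.includes("]]>")) {
    throw new Error('code contains "]]>", which would end the CDATA section');
  }
  if (/<\/script/i.test(code)) {
    throw new Error('code contains "</script", which would end the element in HTML');
  }
  return [
    "<script>",
    "/*<![CDATA[*/",
    code,
    "/*]]>*/",
    "</script>",
  ].join("\n");
}

console.log(buildPolyglotScript("if (a<b && c<d) { alert('Hooray'); }"));
```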
Shadow2531
  • 15
    Just as a matter of code safety, it is better to surround your CDATAs with block comments `/* ... */` because otherwise if the line breaks are removed, the code will break – BryanH Nov 03 '15 at 20:05
  • shouldn't "...as XML" in first section be "...as non-interpreted text"? In http://stackoverflow.com/questions/2784183/what-does-cdata-in-xml-mean we see "...these strings includes data that _could_ be interpreted as XML markup, but should not be." – matt wilkie Jan 31 '17 at 20:38
  • @mattwilkie, What I mean with "as XML" is "When browsers use their XML parser (as opposed to the HTML parser) to parse the markup because the document was sent with an XML-based mime type or the file containing the markup has an XML-based file extension". – Shadow2531 Feb 01 '17 at 12:07
136

HTML

An HTML parser will treat everything between <script> and </script> as part of the script. Some implementations don't even need a correct closing tag; they stop script interpretation at the first "</", which is correct according to the HTML 4 specification.

Update: In HTML5, and with current browsers, that is no longer the case.

So, in HTML, this is not possible:

<script>
var x = '</script>';
alert(x)
</script>

A CDATA section has no effect at all. That's why you need to write

var x = '<' + '/script>'; // or
var x = '<\/script>';

or similar.
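
When the string comes from data rather than being typed by hand, the same escaping can be automated. The following is a sketch only; `toInlineScriptLiteral` is a made-up name, and it assumes the value is JSON-serializable. It escapes `&` as well as `<` so the output is also safe to embed when the page is parsed as XML.

```javascript
// Serialize a value as a JavaScript literal that is safe to embed in an
// inline script: no "<" (so no "</script>") and no "&" remain in the output.
function toInlineScriptLiteral(value) {
  return JSON.stringify(value)
    .replace(/</g, "\\u003C")
    .replace(/&/g, "\\u0026");
}

const literal = toInlineScriptLiteral("</script> & co");
console.log("var x = " + literal + ";");
```

Because `\u003C` and `\u0026` are ordinary string escapes, the JavaScript engine still sees the original characters at run time.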

This also applies to XHTML files served as text/html. (Since older versions of IE do not support XML content types, this covers most XHTML in practice.)

XML

In XML, different rules apply. Note that (non-IE) browsers only use an XML parser if the XHTML document is served with an XML content type.

To the XML parser, a script tag is no different from any other tag. In particular, a script node may contain non-text child nodes, triggered by "<"; and a "&" sign starts an entity reference.

So, in XHTML, this is not possible:

<script>
if (a<b && c<d) {
    alert('Hooray');
}
</script>

To work around this, you can wrap the whole script in a CDATA section. This tells the parser: 'In this section, don't treat "<" and "&" as control characters.' To prevent the JavaScript engine from interpreting the "<![CDATA[" and "]]>" marks, you can wrap them in comments.

If your script does not contain any "<" or "&", you don't need a CDATA section anyway.
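
As a rough check, the rule can be expressed as a one-line test. `needsCdata` is a hypothetical helper for illustration. Besides "<" and "&", it also flags a literal "]]>", which is my addition: XML does not allow that sequence unescaped in element content either.

```javascript
// Returns true if the script text contains anything an XML parser would
// not accept as plain character data: "<", "&" or the sequence "]]>".
function needsCdata(code) {
  return /[<&]|\]\]>/.test(code);
}

console.log(needsCdata("var x = 1;"));
console.log(needsCdata("if (a<b && c<d) { alert('Hooray'); }"));
```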

Ayo K
user123444555621
  • 2
    The statement “A CDATA section has no effect at all” is not true for (the proposed) HTML5, which recognizes the construct. http://www.w3.org/TR/html5/syntax.html#cdata-sections – danorton Mar 06 '12 at 20:29
  • 3
    @danorton Interesting. I think that's a pretty ugly mix. Still no effect in script content though. – user123444555621 Mar 06 '12 at 22:06
  • Wow, I thought having scripts as PCDATA in XHTML was cumbersome and pointless; wasn't thinking of what would happen if "" is used if it was CDATA like in HTML4. I love XHTML :) – Hawken May 12 '12 at 19:38
  • http://diveintohtml5.info/past.html#xhtml Apparently not all XHTML is made (or served) equal. – jinglesthula May 18 '12 at 18:55
  • 2
    Did not know that _any_ `` inside script tags is bad. – Salman A Mar 05 '13 at 18:38
  • 3
    @SalmanA That's one of HTML's oddities and officially called *ETAGO*. Learn more: http://mathiasbynens.be/notes/etago (while the article states that no browser ever implemented that feature, I'm pretty sure it caused some trouble for me. Maybe in some other tool) – user123444555621 Mar 05 '13 at 20:35
  • 1
    Actually I ran into validation problems -- `` fails to validate but after reading your answer and changing to `` fixed it. – Salman A Mar 05 '13 at 21:52
32

Basically, it allows you to write a document that is both XHTML and HTML. The problem is that within XHTML, the XML parser will interpret characters such as & and < inside the script tag and report an XML parsing error. You could instead write your JavaScript with entities, e.g.:

if (a &gt; b) alert('hello world');

But this is impractical. The bigger problem is that when the page is read as HTML, the content of the script tag is treated as CDATA 'by default', so the entities are not decoded and such JavaScript will not run. Therefore, if you want the same page to work with both XHTML and HTML parsers, you need to enclose the script content in a CDATA section in XHTML, but NOT in HTML.

The trick is to put the start and end markers of the CDATA section inside JavaScript comments. In HTML, the JavaScript parser ignores the markers (they are comments). In XHTML, the XML parser (which runs before the JavaScript engine) detects them and treats everything up to the end marker as CDATA.

Chris Middleton
ondra
24

It's an X(HT)ML thing. When you use symbols like < and > within the JavaScript, e.g. for comparing two integers, they are parsed as XML and are therefore treated as the beginning or end of a tag.

The CDATA section means that the following lines (everything up to the ]]>) are not XML and thus should not be parsed that way.

Franz
18

Do not use CDATA in HTML4 but you should use CDATA in XHTML and must use CDATA in XML if you have unescaped symbols like < and >.

Loren Segal
  • 11
    CDATA is not valid in HTML4. Simply put, it's not part of the grammar. CDATA is a syntax of XML, and XHTML is an XML subset. Therefore it should only be used inside XML (and its subsets). HTML on the other hand is not XML. – Loren Segal Aug 30 '10 at 03:59
17

It is to ensure that XHTML validation works correctly when you have JavaScript embedded in your page, rather than externally referenced.

XHTML requires that your page strictly conform to XML markup requirements. Since JavaScript may contain characters with special meaning, you must wrap it in CDATA to ensure that validation does not flag it as malformed.

With HTML pages on the web you can just include the required JavaScript between <script> and </script> tags. When you validate the HTML on your web page, the JavaScript content is considered to be CDATA (character data) and is therefore ignored by the validator. The same is not true if you follow the more recent XHTML standards in setting up your web page. With XHTML, the code between the script tags is considered to be PCDATA (parsed character data), which is therefore processed by the validator.

Because of this, you can't just include JavaScript between the script tags on your page without 'breaking' your web page (at least as far as the validator is concerned).

You can learn more about CDATA here, and more about XHTML here.

informatik01
LBushkin
11

CDATA indicates that the contents within are not XML.

Here is an explanation on wikipedia

Andre Lombaard
Alex Beardsley
9

When you are going for strict XHTML compliance, you need the CDATA section so that less-than signs and ampersands are not flagged as invalid characters.

Chris Shaffer
8

To avoid XML errors during XHTML validation.

gehsekky
8

CDATA tells the browser to treat the text as is and not to parse it as HTML markup.

Ikaso
5

CDATA is necessary in any XML dialect, because text within an XML node is treated as a child element before being evaluated as JavaScript. This is also the reason why JSLint complains about the < character in regexes.

Community
Paul Sweatte
5

CDATA indicates that the contents within are not XML.

Jim
2

That way older browsers don't parse the JavaScript code and the page doesn't break.

Backwards compatibility. Gotta love it.

Tyler Carter
2

When you want it to validate (in XML/XHTML - thanks, Loren Segal).

Community
ceejayoz