18

In HTML, tags and entities aren't parsed within <script> tags, and </ immediately ends the tag. Thus,

<script><b>fun &amp; things</

will give you a script tag with the exact contents <b>fun &amp; things.

If you're including JSON and you want to include the characters </ in your script, then you can replace it with <\/ because the only place for those characters to appear is in a string, and \/ is an escape sequence that turns into a single forward slash.
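As a minimal sketch of that trick, assuming the payload comes from `JSON.stringify` (the helper name `jsonForScriptTag` is made up for illustration): in JSON, `</` can only occur inside a string, so rewriting it as `<\/` leaves the parsed value unchanged.

```javascript
// Hypothetical helper: serialize a value as JSON that can sit between
// <script> and </script> without the "</" sequence appearing in it.
function jsonForScriptTag(value) {
  // JSON.stringify only ever emits "</" inside a string, so a plain text
  // replace is enough here. "\/" is a valid JSON escape for "/".
  return JSON.stringify(value).replace(/<\//g, "<\\/");
}

const data = { html: "<b>fun &amp; things</b></script>" };
const embedded = jsonForScriptTag(data);

console.log(embedded); // {"html":"<b>fun &amp; things<\/b><\/script>"}
console.log(JSON.parse(embedded).html === data.html); // true
```

This covers only the `</` sequence discussed here, and only because the content is JSON; it is not a general escape for arbitrary script content.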

However, if you're not using JavaScript, then this trick doesn't work. In my case specifically I'm trying to insert a <script type="math/tex"> into the source so that MathJax will process it. Is there a way to escape </ in the original HTML source? (I don't have a particular need for </ but I'm writing a generic tool and want to make it possible to use any text.)

(It's possible to create the script tag in JavaScript and populate its innerText, but I'm working with the raw HTML so I can't do that.)

Sophie Alpert
  • 4
    Can't you use `&lt;`? It is not clear to me what you are trying to do here. – Oded Feb 08 '13 at 20:33
  • So, basically, you're trying to encode any HTML between `<script>` tags without encoding the actual script tags themselves? – sbeliv01 Feb 08 '13 at 20:37
  • 4
    @Oded: No, due to how script tags work, if you use `&lt;` you'll actually get the four characters `&`, `l`, `t`, `;`. – Sophie Alpert Feb 08 '13 at 21:28
  • 1
    Can you clarify the question and what you are trying to achieve? It is really difficult to understand what you want to do here and why it is important to have the sequence `</`. – Oded Feb 08 '13 at 21:30
  • @Oded: Sorry, perhaps it's a bit clearer now. – Sophie Alpert Feb 08 '13 at 21:47
  • If you are trying to escape `` in JavaScript code so it can be safely embedded in html between `` tags you should replace `` with `` or ``. It's safer to do because if you replace it with `<\/script>` you might break JavaScript code like this: `var q = -1 – Kamil Szot Jun 01 '14 at 19:05
  • Possible duplicate of [How to align content of a div to the bottom?](https://stackoverflow.com/questions/585945/how-to-align-content-of-a-div-to-the-bottom) – imz -- Ivan Zakharyaschev Jun 15 '17 at 18:59

4 Answers

26

I came here looking for a way to universally escape </script> inside the JavaScript code.

After a bit of research, I figured out that if you are trying to escape </script> in JavaScript code so that it can be safely embedded in HTML between <script> and </script> tags, you should replace </script with </scr\ipt. This is safer than replacing it with <\/script, which can break JavaScript code like this: var q = -1</script/.test("script");

Of the letters s, c, r, i, p, t, only i has no special meaning after a backslash in JS string and regexp literals, so \i is the only escape you can safely introduce into </script without changing the meaning of valid JS code.

Be careful not to look for </script> but rather </script because </script asdasdas> will end your script just as well as </script> does.
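The replacement described above could be sketched roughly as follows. The helper name `escapeScriptCloseInJs` is hypothetical, and the sketch assumes the sequence only occurs in ordinary string literals or regexp literals without the `u` flag.

```javascript
// Hypothetical helper: rewrite "</script" inside JavaScript source text.
function escapeScriptCloseInJs(source) {
  // Match "</script" without the ">" and ignoring case, because
  // "</SCRIPT foo>" closes the element too. A backslash goes before the "i".
  return source.replace(/<\/(scr)(ipt)/gi, "</$1\\$2");
}

const code = 'var q = -1</script/.test("script");';
const safe = escapeScriptCloseInJs(code);

console.log(safe); // var q = -1</scr\ipt/.test("script");

// Both versions evaluate to the same value.
const before = new Function(code + " return q;")();
const after = new Function(safe + " return q;")();
console.log(before === after); // true
```

Where a backslash is significant, for example in a `String.raw` template or a regexp with the `u` flag, this rewrite would change or break the code, so it is not a drop-in fix for every script.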

Sorry, this doesn't help the OP in any way. The accepted answer is absolutely correct that you need to know which constructs are legal in the language inside your <script></script> to know how to escape an occurrence of </script> without breaking the code.

Kamil Szot
10

In HTML, as opposed to XHTML, the content of a script element is processed as plain text except for the occurrence of an end tag, so that </ ends processing and must, in conforming documents, start the end tag </script>. There is no general mechanism to avoid this. Any methods that circumvent this feature are unavoidably dependent on the “language” used inside the element. The word “language” is in quotes here, because the content can be just about anything, as long as your code can parse and process it.

So: no general mechanism, but for content other than JavaScript or some of the few other client-side scripting languages recognized by some browsers, you can make your own rules.

Jukka K. Korpela
  • 3
    Isn't Unicode `\u003c`-like encoding inside a script tag a universal solution? It's implemented for JSON in Rails and Flask (implementation links at the bottom of my question: http://stackoverflow.com/q/39193510/518169) – hyperknot Aug 28 '16 at 23:28
9

The HTML specification explains in detail what is allowed and how to securely escape content. Especially considering HTML's history, this is a non-trivial task.

From the HTML specification:

The easiest and safest way to avoid the rather strange restrictions described in this section is to always escape "<!--" as "<\!--", "<script" as "<\script", and "</script" as "<\/script" when these sequences appear in literals in scripts (e.g., in strings, regular expressions, or comments), and to avoid writing code that uses such constructs in expressions. Doing so avoids the pitfalls that the restrictions in this section are prone to triggering: namely, that, for historical reasons, parsing of script blocks in HTML is a strange and exotic practice that acts unintuitively in the face of these sequences.

Source: https://www.w3.org/TR/html52/semantics-scripting.html#restrictions-for-contents-of-script-elements
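For a JavaScript string literal, the three escapes from the quoted passage could be applied along these lines. This is only a sketch: `toScriptSafeStringLiteral` is a made-up name, and it assumes an engine that accepts `JSON.stringify` output as a string literal (ES2019 or later).

```javascript
// Hypothetical helper: turn arbitrary text into a JavaScript string literal
// that avoids the three sequences named in the quoted passage.
function toScriptSafeStringLiteral(text) {
  return JSON.stringify(text)
    .replace(/<!--/g, "<\\!--")          // "<!--"     becomes "<\!--"
    .replace(/<(script)/gi, "<\\$1")     // "<script"  becomes "<\script"
    .replace(/<\/(script)/gi, "<\\/$1"); // "</script" becomes "<\/script"
}

const text = "<!-- <script>alert(1)</script> -->";
const literal = toScriptSafeStringLiteral(text);

console.log(literal); // "<\!-- <\script>alert(1)<\/script> -->"

// Evaluated as JavaScript, the literal yields the original text again.
const roundTrip = new Function("return " + literal + ";")();
console.log(roundTrip === text); // true
```

The result is a JavaScript literal, not JSON: `\!` and `\s` are not valid JSON escapes, so `JSON.parse` would reject it.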

TommyMason
-2

More HTML encoding might help? &lt; for the <.

Difficult to know quite what you are doing with it. If you are not sure of what the content between the script tags might be (looks like you might be trying to use it as a template holder of some sort?) then you could/should use a CDATA section:

<script><![CDATA[<b>fun &amp; things</b>]]></script>

That should do it. More description could help give a better answer too :)

n. m. could be an AI
Pete Duncanson