462

The reason for this "escapes" me.

JSON escapes the forward slash, so a hash {a: "a/b/c"} is serialized as {"a":"a\/b\/c"} instead of {"a":"a/b/c"}.

Why?

Jason S
  • 4
    FWIW I've never seen forward slashes escaped in JSON, I just noticed it with the Java library at http://code.google.com/p/json-simple/ – Jason S Oct 16 '09 at 22:29
  • 35
    PHP's `json_encode()` escapes forward slashes by default, but has the `JSON_UNESCAPED_SLASHES` option starting from PHP 5.4.0 (March 2012) – Walter Tross Jul 01 '12 at 19:52
  • 12
    Here's PHP code that escapes only the slash in `'</'` rather than every slash: `echo str_replace('</', '<\/', json_encode($obj, JSON_UNESCAPED_UNICODE | JSON_UNESCAPED_SLASHES));` – rustyx Jan 20 '13 at 13:52
  • Does the code include the '': or does it start at echo? Starting at echo fails for me; I simply don't get anything. Yes, I replaced $obj with my variable :) – marciokoko Jul 08 '13 at 14:58
  • 1
    JSON doesn't escape or serialize anything... your JSON serializer does. Which one are you using? – Lightness Races in Orbit May 11 '17 at 16:31

5 Answers

359

JSON doesn't require you to do that; it allows you to do that. It also allows you to write "\u0061" for "a", but that's not required either, as Harold L points out:

The JSON spec says you CAN escape forward slash, but you don't have to.

Harold L answered Oct 16 '09 at 21:59
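To make the "allowed, not required" point concrete, here is a small sketch using JavaScript's built-in `JSON` object. The variable names are only for illustration. It shows that the escaped and unescaped spellings decode to the same value, and that this particular serializer does not escape the slash at all.

```javascript
// Both spellings decode to the same string: "\/" is an optional escape.
const escaped = JSON.parse('{"a":"a\\/b\\/c"}');
const plain = JSON.parse('{"a":"a/b/c"}');
console.log(escaped.a === plain.a); // true

// "\u0061" is likewise just another way to write "a".
const viaUnicode = JSON.parse('"\\u0061"');
console.log(viaUnicode === "a"); // true

// JavaScript's built-in serializer leaves the slash alone.
const serialized = JSON.stringify({ a: "a/b/c" });
console.log(serialized); // {"a":"a/b/c"}
```

Whether a given serializer emits `\/` is an implementation choice; any conforming parser should accept both forms.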

Allowing \/ helps when embedding JSON in a <script> tag, which doesn't allow </ inside strings, like Seb points out:

This is because HTML does not allow a string inside a <script> tag to contain </, so in case that substring's there, you should escape every forward slash.

Seb answered Oct 16 '09 at 22:00
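As a rough sketch of how the optional escape can be used for inline scripts without touching every slash, the helper below rewrites only the `</` sequence. The function name `jsonForScript` and the sample payload are hypothetical, and this covers only the `</` case discussed here; it should not be read as a complete defense for untrusted data.

```javascript
// Hypothetical helper: serialize a value for inlining inside a <script> element.
// Only the "</" sequence is rewritten to "<\/"; other slashes stay unescaped.
function jsonForScript(value) {
  return JSON.stringify(value).replace(/<\//g, "<\\/");
}

const payload = { url: "http://example.com/a/b", note: "</script>" };
const inlined = jsonForScript(payload);
console.log(inlined);
// {"url":"http://example.com/a/b","note":"<\/script>"}

// The result is still valid JSON and decodes to the original strings.
console.log(JSON.parse(inlined).note === payload.note); // true
```

Because `<` never appears in JSON outside of string values, the replacement can only ever land inside a string, where `\/` is a legal escape.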

Some of Microsoft's ASP.NET Ajax/JSON APIs use this loophole to add extra information, e.g., a datetime will be sent as "\/Date(milliseconds)\/". (Yuck)
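For illustration, a consumer could turn such values back into dates with a reviver along these lines. This is a sketch that assumes only the plain `Date(milliseconds)` form mentioned above; the names `MS_DATE` and `reviveMsDate` and the sample timestamp are made up for the example.

```javascript
// Hypothetical reviver for the "/Date(milliseconds)/" convention.
// Note: once parsed, "\/Date(0)\/" and "/Date(0)/" are the same string, so this
// check works on the decoded value, not on the escape itself.
const MS_DATE = /^\/Date\((-?\d+)\)\/$/;

function reviveMsDate(key, value) {
  if (typeof value !== "string") return value;
  const match = MS_DATE.exec(value);
  return match ? new Date(Number(match[1])) : value;
}

const wire = '{"created":"\\/Date(1255730340000)\\/","path":"a/b"}';
const parsed = JSON.parse(wire, reviveMsDate);
console.log(parsed.created instanceof Date); // true
console.log(parsed.created.getTime());       // 1255730340000
console.log(parsed.path);                    // a/b
```

A standard parser cannot tell whether the slashes were escaped on the wire, so a deserializer that wants to rely on the escape itself has to inspect the raw text before decoding.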

Bergi
Ruben
  • 1
    Thanks for the answer. Never thought of that edge case. They should replace instances of </ with <\/, but not escape all the other slashes. :/ – Jason S Oct 16 '09 at 22:15
  • 4
    That would be a good thing, escaping just </ – Ruben Oct 16 '09 at 22:20
  • 1
    yeah, the hoops people have gone through for HTML... this is now the 2nd recent surprise for me re: JSON. The other one was that Infinity and NaN are not serialized. http://stackoverflow.com/questions/1423081/ – Jason S Oct 16 '09 at 22:25
  • JSON conversion is useful for generating inline scripts, e.g. `var f = <%= xxx.to_json %>;`. It should definitely not escape *all* forward slashes: that makes every JSON-encoded URL longer, instead of handling just the rare edge case. – Glenn Maynard Nov 30 '10 at 00:08
  • 8
    See this blog post for the rationale for the ASP.NET JSON date format: http://weblogs.asp.net/bleroy/archive/2008/01/18/dates-and-json.aspx – Michiel van Oosterhout Dec 18 '11 at 21:51
  • 1
    Why doesn't it just escape the ( </ ) character pair instead of all forward slashes ( / ), then? –  Feb 20 '12 at 16:27
  • 1
    @GuyMontag: Probably because it is slightly more efficient / easier to implement when you don't have to remember which characters you have seen before to decide when to output an escape sequence. This way it's a simple per character substitution. – Ruben Feb 21 '12 at 19:03
  • 2
    ...the only characters that need to be escaped in an encoding mechanism are the special characters used in the encoding mechanism structure itself( for JSON that would be ", {,},[,], etc.)...all other characters are payload and should be treated as such....if you break html because you send the wrong characters it is not the "structured data's encoding mechanism's responsibility to fix this....JSON needs to be replaced....it should be agnostic to client side language, server side language, and application, it is a payload delivery mechanism. –  Jun 01 '12 at 21:12
  • 33
    JSON needs to be replaced because a particular implementation of a JSON serializer outputs some JSON that (**while being entirely valid JSON**) has some extra characters so it can also be dropped into an HTML script element as a JS literal?! That isn't so much throwing the baby out with the bathwater as throwing the baby out because someone bought him a set of water wings. – Quentin Jun 01 '12 at 22:53
  • 1
    I think instead of “Yuck”, the hack of using an escaped forward slash to mark a string as being more than a string is neat and much better than alternatives such as iterating through a deserialized JSON object and converting everything that matches a `RegExp` for ISO 8601 to a `Date` object or needing a separate key to indicate whether the serialized value is a pure string or a `Date`. – binki May 19 '15 at 18:58
  • 26
    What I don't get, is why a JSON serializer would even care where the JSON ends up. On a web page, in an HTTP request, whatever. Let the final renderer do additional encoding, if it needs it. – Dan Ross Apr 28 '16 at 05:11
  • 8
    @DanRoss And it can. Escaping `/` is not *required*, it is *allowed*, to ease the use of JSON. If you don't want to escape `/`, then don't. – Andreas May 06 '16 at 20:56
  • *"when embedding JSON in a ` (perhaps with spaces in there) that terminates them. – T.J. Crowder May 11 '17 at 14:15
  • 3
    @T.J.Crowder The HTML 3.2 and 4.01 specs explicitly forbid `` inside ``? You could (should?) interpret this as `
    `. As this is how it should be parsed if you'd change `` (and other magic incantations) just to be absolutely sure the content of a script tag was not misinterpreted.
    – Ruben May 12 '17 at 23:16
  • @T.J.Crowder And JSON (2000) predates the HTML5 spec (2014). But these days, you're right. And it's just a lot faster to escape all occurrences of `</`, without having to look ahead for `script`, so why bother. – Ruben May 12 '17 at 23:22
  • 1
    @Ruben: HTML 3.2 is ancient history, but where do you see that in even the HTML4 spec? In any case, the HTML5 spec codified what browsers had actually been doing for years. Separately: What makes you think replacing `/` is faster than replacing, say, `</`? It would at least minimize bloat. But in any case, it's handling the issue at the wrong level (JSON rather than the point at which you're using JSON in a `script` tag, **if** it happens you are). – T.J. Crowder May 13 '17 at 11:25
  • @T.J.Crowder Section B.3.2 "Specifying non-HTML data" specifically deals with this issue. Secondly, handling `</` is marginally slower than looking for just `/` because you need to track what the previous character was. Blindly replacing `/` is just simpler, and that's what happens for all the reserved characters too. Besides, the JSON spec still allows you to *not* escape `/`. So if you don't like the bloat: use/write a JSON formatter that doesn't escape `/`. No one is forcing you either way, as it's an optional encoding feature. – Ruben May 13 '17 at 20:08
  • @Ruben: Thanks for finding that for me, I always like to have that kind of arcane info. :-) I suspect you'll find if you actually test it that PHP will replace `</` just as fast as `/`. But it's not particularly important, it's still handling it at the wrong level. – T.J. Crowder May 14 '17 at 07:55
  • @Ruben nowadays we use the even more complicated `` form, which, using a CDATA, escapes (hah!) this problem as well. (To the curious: `` is the matching “proper” escape for CSS, and both are also XHTML/1.1 clean.) – mirabilos Mar 28 '18 at 03:56
  • 1
    **Warning** users should not rely on this feature of `json_encode()` and should properly escape JSON embedded in an HTML document to avoid XSS. See [Rule #3.1](https://cheatsheetseries.owasp.org/cheatsheets/Cross_Site_Scripting_Prevention_Cheat_Sheet.html#rule-31---html-escape-json-values-in-an-html-context-and-read-the-data-with-jsonparse) of the OWASP cheatsheet. – jchook Oct 12 '19 at 22:05
  • @jchook Rule 3.1 that you mention seems to indicate that `<` and `>` are unsafe characters, and `<\/script>` is thus inadequate encoding to prevent XSS. Are you aware of an exploit for this, as I can’t yet see how it might be exploited. – Simon East Mar 20 '21 at 01:31
  • @SimonEast See [this post](https://forums.phpfreaks.com/topic/294115-json_encode-is-not-a-security-feature-or-how-to-pass-php-values-to-javascript/) for examples, e.g. 1. it can be disabled at runtime or removed as default in a future version of PHP. 2. HTML entities, e.g. `&quot;` instead of `"`. 3. Text encoding assumed to be UTF-8 by json_encode – jchook Mar 21 '21 at 01:47
50

The JSON spec says you CAN escape forward slash, but you don't have to. A reverse solidus must be escaped, but you do not need to escape a solidus. Section 9 of the spec (ECMA-404) says:

"All characters may be placed within the quotation marks except for the characters that must be escaped: quotation mark (U+0022), reverse solidus (U+005C), and the control characters U+0000 to U+001F."
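The quoted rule can be turned into a minimal string escaper. The sketch below is illustrative only: `escapeJsonString` is a made-up name, and real serializers typically use shorter escapes such as `\t` or `\n` where this one always falls back to `\uXXXX`.

```javascript
// Hypothetical minimal escaper: escapes only what the quoted rule requires.
function escapeJsonString(text) {
  let out = '"';
  for (const ch of text) {
    const code = ch.codePointAt(0);
    if (ch === '"') out += '\\"';             // quotation mark (U+0022)
    else if (ch === "\\") out += "\\\\";      // reverse solidus (U+005C)
    else if (code <= 0x1f) out += "\\u" + code.toString(16).padStart(4, "0");
    else out += ch;                           // includes "/", which needs no escape
  }
  return out + '"';
}

const sample = 'a/b "quoted" back\\slash\ttab';
const encoded = escapeJsonString(sample);
console.log(encoded);
console.log(JSON.parse(encoded) === sample); // true
```

The forward slash passes through untouched, and the output still round-trips through a standard parser.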

StayOnTarget
Harold L
23

PHP escapes forward slashes by default, which is probably why this appears so commonly. I suspect it's because embedding the string "</script>" inside a <script> tag is considered unsafe.

Example:

<script>
var searchData = <?= json_encode(['searchTerm' => $_GET['search'], ...]) ?>;
// Do something else with the data...
</script>

Based on this code, an attacker could append this to the page's URL:

?search=</script> <some attack code here>

Which, if PHP's protection was not in place, would produce the following HTML:

<script>
var searchData = {"searchTerm":"</script> <some attack code here>"};
...
</script>

Even though the closing script tag is inside a string, it will cause many (most?) browsers to exit the script tag and interpret the items following as valid HTML.

With PHP's protection in place, it will appear instead like this, which will NOT break out of the script tag:

<script>
var searchData = {"searchTerm":"<\/script> <some attack code here>"};
...
</script>

This behavior can be disabled by passing the JSON_UNESCAPED_SLASHES flag, but most developers will not bother, since the default output is already valid JSON.
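The difference between the two outputs can be sketched with a deliberately simplified model of how the HTML parser finds the end of a script element. This is an assumption for illustration, not a real HTML parser: the function `scriptBodyAsSeenByHtml` is made up, and the all-slash replacement merely mimics the default output described above.

```javascript
// Simplified model (an assumption, not a real HTML parser): the script element's
// content ends at the first "</script", regardless of JavaScript string quoting.
function scriptBodyAsSeenByHtml(source) {
  const end = source.toLowerCase().indexOf("</script");
  return end === -1 ? source : source.slice(0, end);
}

const data = { searchTerm: "</script> <some attack code here>" };

// Unescaped slashes, as with JSON_UNESCAPED_SLASHES.
const unescaped = "var searchData = " + JSON.stringify(data) + ";";
// Every slash escaped, mimicking the default output shown above.
const escapedAll =
  "var searchData = " + JSON.stringify(data).replace(/\//g, "\\/") + ";";

console.log(scriptBodyAsSeenByHtml(unescaped));
// var searchData = {"searchTerm":"
console.log(scriptBodyAsSeenByHtml(escapedAll) === escapedAll); // true
```

In the unescaped case the script is cut off in the middle of the string literal, and everything after the closing tag would be handed to the HTML parser.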

Simon East
  • 3
    "*is considered unsafe*" -> it really is unsafe. Exploit: `";` Try it, the bodies will alert the floor rather than getting a variable called 'the' with script tags in its value. You can say "then don't embed it in a page", yeah, that's a possible workaround, but a lot of people do this anyway (so let's just make good escape functions because why not) and frankly I understand their point: it would make sense if it were safe to have JSON data with correctly escaped data values in JavaScript. – Luc Mar 19 '21 at 13:44
  • 2
    Thanks @Luc - great example of why PHP has opted to escape slashes by default! Functions should be secure by default, and only insecure when you specifically want it that way. – Simon East Mar 20 '21 at 01:12
  • I beg to differ. PHP shouldn't encode forward slashes by default. If a frontend developer wants to echo a user-supplied value into HTML code, he should realize that it is always very dangerous, whether it is inside – Daniel Wu Aug 23 '23 at 04:01
  • @DanielWu Unfortunately many developers are lazy, which is why “secure by default” is a good strategy. Developers can disable those additional slashes by adding the extra parameter if they understand the consequences. (Also I don’t think `htmlspecialchars()` will work in this scenario. The slashes are still required.) – Simon East Aug 26 '23 at 00:56
22

I asked the same question some time ago and had to answer it myself. Here's what I came up with:

It seems, my first thought [that it comes from its JavaScript roots] was correct.

'\/' === '/' in JavaScript, and JSON is valid JavaScript. However, why are the other ignored escapes (like \z) not allowed in JSON?
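The asymmetry is easy to check in a JavaScript console. The snippet below is just a quick demonstration and the variable name `rejected` is arbitrary: JavaScript string literals tolerate both `\/` and `\z`, while `JSON.parse` accepts only the former.

```javascript
// In a JavaScript string literal, "\/" is just "/" ...
console.log("\/" === "/"); // true
// ... and an unrecognized escape such as "\z" is silently read as "z".
console.log("\z" === "z"); // true

// JSON accepts "\/" but rejects "\z".
console.log(JSON.parse('"\\/"')); // /

let rejected = false;
try {
  JSON.parse('"\\z"');
} catch (err) {
  rejected = err instanceof SyntaxError;
}
console.log(rejected); // true
```

This also means a consumer never has to unescape `\/` by hand: the parser already yields a plain `/`.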

The key for this was reading http://www.cs.tut.fi/~jkorpela/www/revsol.html, followed by http://www.w3.org/TR/html4/appendix/notes.html#h-B.3.2. The feature of the slash escape allows JSON to be embedded in HTML (as SGML) and XML.

hakre
Boldewyn
  • 7
    A structured data payload delivery mechanism should not be tied to language constructs..as this may change in the future...but this might explain the design decisions if there were any of the JSON creators. –  Jun 01 '12 at 21:11
  • 2
    '\/' === '/' So I don't need to unescape forward slashes when receiving my jsonp? – Timmetje Feb 07 '13 at 09:30
1

Yes, some JSON utility libraries do it for various good but mostly legacy reasons. But then they should also offer something like a setEscapeForwardSlashAlways method to turn this behaviour off.

In Java, org.codehaus.jettison.json.JSONObject does offer a method called

setEscapeForwardSlashAlways(boolean escapeForwardSlashAlways)

to switch this default behaviour off.

Pratap Singh